The MLNS2 project aims to design and investigate efficient approaches to fight Simbox fraud and malware proliferation. Addressing these challenges requires multidisciplinary knowledge spanning machine learning, networks, systems, and security.
Smartphones have become the most natural tool for people to communicate, and more than five billion people around the globe use them. As a consequence, cellular networks are an infrastructure of prime importance, as is the software ecosystem built around smartphones. In this context, we identified two major challenges:
1- Attacks at the infrastructure level.
Telephone fraud is a considerable problem for Mobile Network Operators (MNOs) around the world. According to the Communications Fraud Control Association (CFCA), global fraud losses were estimated at USD 28.3 billion in 2019. Illegal bypass termination, also known as Simbox fraud, is by far one of the most prevalent frauds affecting the telecommunication market. In many countries, the international termination rate (ITR) is considerably higher than the local (retail) termination rate (LTR) within the country (e.g., 2.8 to 28 times higher in Cameroon). This makes it profitable for fraudsters to bypass the regular interconnect operator when terminating calls in the country, as they pay the lower local rate instead of the ITR. Simbox fraud is a major problem in developing countries; in some of them, as much as 70% of incoming international call traffic is terminated fraudulently. Simbox fraud is a particularly difficult problem for the following reasons: it involves major stakeholders in the telephony sector, some of which may themselves be fraudulent; it affects revenue that developing countries rely on for their internal development and can hardly do without; and it is aggravated by the growing popularity among mobile subscribers of VoIP applications for international calls. As in most security problems, fraudsters are evolving at the same rate as, if not faster than, the research community. Their use of advances in AI (for instance, to mimic human behavior) has attracted considerable attention because it makes fraudulent calls harder to identify.
2- Attacks at the software ecosystem level.
Android is now the most used mobile operating system, with an 86% market share. Thanks to an active developer community, the application ecosystem grows every day. For example, the Google Play Store holds 3.3 million applications and receives more than 50,000 submissions a month. Estimates indicate that more than 75 billion applications were downloaded on the platform in 2016. Due to this widespread popularity, the Android platform has become a lucrative target for hackers and one of the first-choice platforms for propagating malware. The infection rate on Android devices is constantly increasing, driven by a dramatic proliferation of malware. In the last decade, we have witnessed tremendous activity around designing anti-virus/anti-malware scanners dedicated to mobile platforms. While traditional signature-based techniques are efficient against already known malware, they struggle to deal with new samples. To fight these security threats, machine learning techniques have been steadily adopted in the design of anti-malware scanners to improve their detection rates. However, to detect any malware, an anti-virus must first extract a feature vector from Android applications, a step that remains challenging.
- Addressing Simbox fraud: This includes an in-depth exploration of the Simbox system and its underlying mechanisms, identifying its components, architecture, and fraud strategies. Building on this knowledge of the Simbox fraud ecosystem, we will then explore how to design a system that detects such fraud by leveraging machine learning, mathematics, and systems knowledge.
- Addressing malware proliferation via on-device antiviruses: The purpose is to build new-generation on-device scanners capable of analyzing malicious behavior on the fly. This includes leveraging reinforcement learning to determine, from dynamically generated traces, whether a target application is malicious and, if so, to quarantine it. Additionally, we plan to explore promising new approaches to fight 0-day malware, such as Mobile Crowdsourcing Analysis (MCA), which has its roots in the distributed systems and distributed machine learning domains.