Article by Claude Castelluccia on manipulation, published on Le Monde's Binaire blog: here
The accelerated growth of the Internet has outpaced our ability as individuals to maintain control over our personal data. The recent advent of personalized services has led to the massive collection of personal data and the construction of detailed profiles about users. However, users have no information about the data that constitute their profiles, nor about how these are exploited by the different entities (Internet companies, telecom operators, …). This lack of transparency gives rise to ethical issues such as discrimination and unfair processing. In this associate team, we propose to strengthen the complementarity and the current collaborations between the Inria Privatics group and UQAM to advance research and understanding of data and algorithmic transparency and accountability.
29 January 2018: Opening of the MOOC “Protection de la vie privée dans le monde numérique” (Protecting privacy in the digital world), by Cédric Lauradoux and Vincent Roca.
Freely accessible on the FUN platform from 29/01 to 18/03, this MOOC will cover the notion of personal data and the associated legislation, the protection of one's data and digital identity, the privacy risks associated with smartphone use, and finally the protection of one's email.
Boosted by recent legislation, data anonymization is fast becoming a norm. However, no generic solution has yet been found to safely release data. As a consequence, data custodians often resort to ad-hoc means to anonymize datasets. Both past and current practices indicate that hashing is often believed to be an effective way to anonymize data. Unfortunately, in practice it is only rarely effective. In this work, we expose the limits of cryptographic hash functions as an anonymization technique. The anonymity set is the best privacy model that hash functions can achieve; however, this model has several shortcomings. We provide three case studies to illustrate how hashing yields only weakly anonymized data: MAC address anonymization, email address anonymization, and an analysis of Google Safe Browsing.
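To make the weakness concrete, here is a minimal sketch (the MAC address is made up, and the search space is deliberately reduced to keep the demo fast) of why hashing a MAC address does not anonymize it: the input space is so small and structured that the hash can be inverted by exhaustive search.

```python
import hashlib

def sha256_hex(s: str) -> str:
    return hashlib.sha256(s.encode()).hexdigest()

# A "pseudonymized" MAC address: only its hash is published.
secret_mac = "a4:5e:60:00:12:ab"
published = sha256_hex(secret_mac)

# An attacker who knows the vendor prefix (OUI, the first 3 bytes) only has
# to enumerate the 2**24 possible suffixes. We search a tiny sub-range here;
# the full search is easily done offline on commodity hardware.
def brute_force(target_hash: str, oui: str = "a4:5e:60"):
    for n in range(0x2000):  # demo range; a real attack uses range(2**24)
        candidate = "{}:{:02x}:{:02x}:{:02x}".format(
            oui, (n >> 16) & 0xFF, (n >> 8) & 0xFF, n & 0xFF)
        if sha256_hex(candidate) == target_hash:
            return candidate
    return None

recovered = brute_force(published)
print(recovered)  # the original MAC is recovered from its hash alone
```

Because every device in the dataset shares this tiny input space, hashing provides at best a weak anonymity set, not anonymization.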
Generative models are used in a wide range of applications that build on large amounts of contextually rich information. However, due to possible privacy violations of the individuals whose data is used to train these models, publishing or sharing generative models is not always viable. In this work, we develop a novel technique for privately releasing generative models and entire high-dimensional datasets produced by these models. We model the generator distribution of the training data with a mixture of k generative neural networks. These are trained together and collectively learn the generator distribution of the dataset. The data is divided into k clusters using a novel differentially private kernel k-means; each cluster is then given to a separate generative neural network, such as a Restricted Boltzmann Machine or a Variational Autoencoder, which is trained only on its own cluster using differentially private gradient descent. We evaluate our approach on the MNIST dataset, as well as on call detail records and transit datasets, showing that it produces realistic synthetic samples that can also be used to accurately answer an arbitrary number of counting queries.
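As a rough illustration of the differentially private gradient descent step mentioned above, here is a simplified, dependency-free sketch of one update: clip each per-example gradient, average, and add Gaussian noise. The function name and parameters are ours, not from the paper, and a real implementation would also account for the cumulative privacy budget across iterations.

```python
import math
import random

def dp_sgd_update(per_example_grads, lr=0.1, clip=1.0, sigma=1.0):
    """One DP-SGD step (sketch): clip each per-example gradient to L2
    norm `clip`, average the clipped gradients, then add Gaussian noise
    with standard deviation sigma * clip / n. Returns the parameter delta."""
    clipped = []
    for g in per_example_grads:
        norm = math.sqrt(sum(x * x for x in g))
        factor = min(1.0, clip / norm) if norm > 0 else 1.0
        clipped.append([x * factor for x in g])
    n = len(per_example_grads)
    dim = len(per_example_grads[0])
    noisy_mean = []
    for j in range(dim):
        s = sum(g[j] for g in clipped) / n
        s += random.gauss(0.0, sigma * clip / n)  # calibrated Gaussian noise
        noisy_mean.append(s)
    return [-lr * x for x in noisy_mean]
```

For example, a single gradient [3, 4] has norm 5 and is clipped down by a factor of 0.2 before the (noisy) update is applied.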
With the adoption of the EU General Data Protection Regulation (GDPR), conducting a data protection impact assessment will become mandatory for certain categories of personal data processing. A large body of literature has been devoted to data protection impact assessment and privacy impact assessment. However, most of these papers focus on legal and organizational aspects and do not provide many details on the technical aspects of the impact assessment, which may be challenging and time consuming in practice. The general objective of this work was to fill this gap and to propose a methodology that can be applied to conduct a privacy risk analysis in a systematic way, to use its results in the architecture selection process (following the privacy by design approach), and to re-use its generic part for different products or deployment contexts. The proposed analysis proceeds in three broad phases: (1) a generic privacy risk analysis phase, which depends only on the specifications of the system and yields generic harm trees; (2) an architecture-based privacy risk analysis, which takes into account the definitions of the possible architectures of the system and refines the generic harm trees into architecture-specific harm trees; and (3) a context-based privacy risk analysis, which takes into account the context of deployment of the system (e.g., a casino, an office cafeteria, a school) and further refines the architecture-specific harm trees into context-specific harm trees. Context-specific harm trees can be used to take decisions about the most suitable architectures.
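A harm tree can be thought of as an AND/OR tree over exploitable risk sources: a harm is feasible when the tree evaluates to true given the exploitations an adversary can carry out. The following sketch is purely illustrative (the node names are hypothetical, not taken from the methodology):

```python
# Evaluate a harm tree: nodes are tuples ("AND", ...), ("OR", ...) or
# ("LEAF", name). A leaf holds if its risk source is in `feasible_leaves`.
def evaluate(node, feasible_leaves):
    kind = node[0]
    if kind == "LEAF":
        return node[1] in feasible_leaves
    results = [evaluate(child, feasible_leaves) for child in node[1:]]
    return all(results) if kind == "AND" else any(results)

# Hypothetical example: a re-identification harm requires access to the
# data AND (a weak anonymization scheme OR an external auxiliary dataset).
tree = ("AND",
        ("LEAF", "access_to_data"),
        ("OR", ("LEAF", "weak_anonymization"),
               ("LEAF", "auxiliary_dataset")))

print(evaluate(tree, {"access_to_data", "auxiliary_dataset"}))  # True
print(evaluate(tree, {"access_to_data"}))                       # False
```

Refining a generic tree into an architecture- or context-specific one then amounts to pruning or re-weighting branches that the chosen architecture or deployment context makes infeasible.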
We are organizing a workshop on data transparency in Lyon on April 23. More information here.
Users carrying a mobile device with Wi-Fi enabled are exposed to unsolicited Wi-Fi tracking in the physical world. Disabling Wi-Fi on one's phone seems to be a solution to escape such tracking; some Wi-Fi trackers even suggest that users who do not want to be tracked turn off Wi-Fi.
In fact, disabling Wi-Fi on Android is not necessarily enough to avoid tracking: devices may still perform Wi-Fi scans even when Wi-Fi has been disabled. Our study confirms that other settings need to be configured in order to totally mute the Wi-Fi interface. In particular, a setting called “Always allow scanning”, which is not even accessible on some devices, needs to be deactivated to prevent this behavior.
Célestin Matte, Mathieu Cunche, Vincent Toubiana. Does disabling Wi-Fi prevent my Android phone from sending Wi-Fi frames?. [Research Report] RR-9089, Inria – Research Centre Grenoble – Rhône-Alpes; INSA Lyon. 2017 <hal-01575519>
Stop tracking our kids!
Mattel and Hasbro have been called out in the United States for tracking children online… because it is forbidden to “track” and “profile” children in the United States… as well as in Europe and in France (without parental consent) (…
So I ran a quick test on some of the 10 children's websites recommended by the site “memoclic.com”, and the results are edifying!
- Drawings, coloring, crafts and games on jedessine.com. “Here is a reference for online activities and games for children. Despite the somewhat invasive advertising campaigns (one has to make a living, after all), you will quickly find your way to online drawings, coloring pages (online or printable), craft activities, games, readings and multimedia bonuses… in short, a genuine edutainment portal aimed at toddlers as well as slightly older children.”… this site contains at least 5 trackers.
- A news journal for your child. “For yourself, you have LeMonde, Libe, LeFigaro, 20minutes… and many other newspapers. But is there a news journal truly suited to your little ones? News explained to children is available on sites such as LesClesJunior or the Journal des enfants. You can be sure that your child will not stumble upon shocking images or information.”… your child will not stumble upon shocking images or information, but on at least 5 trackers!
- C'est pas sorcier is on the Web. “What child does not already know C'est pas sorcier, the very well made programme broadcast on France 3? In case it does not ring a bell, visit this website, which lists some interesting videos (not all of them, unfortunately) that explain things very well. The topics vary widely but will no doubt interest you.”… better and better: 25 trackers!!!!
Do I really need to continue, or should I call the CNIL?????
Back in 2014, in our ACM WiSec paper entitled WifiLeaks: Underestimated Privacy Implications of the ACCESS_WIFI_STATE Android Permission, we studied how Android apps use the ACCESS_WIFI_STATE permission.
More particularly, we found that 41% of apps (out of the 2,700 most popular apps tested) were requesting the ACCESS_WIFI_STATE permission. This was a very surprising result, because this permission gives access to Wi-Fi-related information (MAC address, BSSID of the connected access point, Wi-Fi scan results, etc.) and we could not imagine why so many apps would need to access this information. The only apps we could imagine needing such access were, for example, Wi-Fi configuration apps or games played over a local area network.
By looking more closely at this Wi-Fi-related information, we found that it may reveal a lot of personally identifiable information, such as a unique device identifier (the MAC address) or Wi-Fi scan results (from which an approximate geographic location can be derived).
We then looked further into the apps and statically analysed all those that requested the ACCESS_WIFI_STATE permission. To our surprise, we found that apps in almost all categories (even apps in the wallpapers and comics categories) call the Java methods protected by this permission to access the aforementioned Wi-Fi-related information. Below is the figure from our paper depicting the categories of apps requesting this information.
We dynamically analysed 88 apps (those that looked interesting to us based on the results of the static analysis) to check whether these apps really access this information and send it over the Internet. We found that this was the case for the third-party library of InMobi, which was accessing Wi-Fi-related data and sending it back to their servers (see figure below).
The Wi-Fi scan information collected by InMobi can be used to derive the location of the device. Indeed, this information is the list of nearby Wi-Fi access points (identified by their BSSID), which can be used to obtain the geolocation of the device using trilateration techniques. This method is actually used by most mobile operating systems to obtain a geolocation without relying on GPS, but the resulting geolocation information is then always protected by the geolocation permission. However, we found that many of the apps that use InMobi do not request the geolocation permission at all, but surreptitiously compute the location by abusing the ACCESS_WIFI_STATE permission on Android.
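To illustrate the trilateration technique, here is a minimal sketch with hypothetical access-point coordinates, assuming the distance to each access point has already been estimated (e.g., from received signal strength). Subtracting the first circle equation from the other two yields a linear 2x2 system in the device's coordinates.

```python
import math

def trilaterate(p1, d1, p2, d2, p3, d3):
    """Solve for (x, y) given three anchor points and distances to each.
    The three circle equations are linearized by subtracting the first
    from the other two, giving two linear equations in x and y."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    b1 = d1**2 - d2**2 + x2**2 - x1**2 + y2**2 - y1**2
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    b2 = d1**2 - d3**2 + x3**2 - x1**2 + y3**2 - y1**2
    det = a11 * a22 - a12 * a21
    x = (b1 * a22 - b2 * a12) / det
    y = (a11 * b2 - a21 * b1) / det
    return x, y

# Hypothetical scenario: device at (10, 20), three APs at known positions.
aps = [(0, 0), (100, 0), (0, 100)]
dists = [math.dist((10, 20), ap) for ap in aps]
print(trilaterate(aps[0], dists[0], aps[1], dists[1], aps[2], dists[2]))
```

In practice, Wi-Fi geolocation services match the observed BSSIDs against a database of known access-point positions and use noisy distance estimates, so a least-squares fit over more than three anchors is used rather than this exact solution.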
Two years after our research was published, the Federal Trade Commission (FTC) reached a $950,000 settlement with InMobi for tracking the locations of millions of consumers, including children, without their knowledge. The FTC alleged that InMobi abused the Wi-Fi state information on the Android system to track people's location without their consent, which is exactly what we showed in our research. FTC policy prevents it from releasing the sources of its investigations, so there is no way to affirm that our research triggered this investigation or was used during it. We can only be sure that we identified a privacy issue serious enough to justify an FTC investigation and a $950,000 penalty. In fact, the penalty is actually $4 million, but the FTC is only asking for $950,000 because the company would otherwise go bankrupt. In addition, the company's privacy practices will be monitored for the next 20 years.
Back in 2014, we also conducted a survey and found that users do not really understand the privacy implications of the ACCESS_WIFI_STATE permission. The permission looks innocuous, but it really is not.
This is because a variety of private information can be derived from the data accessible through this permission (again, see our ACM WiSec paper). The Android OS marks its protection level as ‘normal’ (even though location can be derived from the information accessible through it), whereas location permissions are marked as ‘dangerous’. As the permission description does not explicitly describe all the possible privacy implications, and users do not really understand them, we contacted the Android security team to report our findings in 2014. They acknowledged receipt of our email and told us it had been forwarded to the internal team. However, the permission is still marked as ‘normal’ and its description has not yet been changed.
Update: Although the permission description has not been modified, the Android system has been changed to reduce the privacy issues. More specifically, the getScanResults method, which gives access to the list of nearby Wi-Fi access points and thus to the location, is now protected by the location permissions. However, the getConnectionInfo method, which exposes the device's location through the identifiers of the currently connected network, is still protected only by Wi-Fi permissions and not by location permissions.
We hope this ruling will lead advertising and analytics companies to think twice before abusively collecting sensitive information without being transparent about the data collection.