Master R2 Internship in Natural Language Processing: weakly supervised learning for hate speech detection

Supervisors: Irina Illina, MdC, Dominique Fohr, CR CNRS

Team: Multispeech, LORIA-INRIA


Duration: 5-6 months

Deadline to apply: March 1st, 2020

Required skills: background in statistics and natural language processing, and computer programming skills (Perl, Python). Candidates should email a detailed CV with diploma.

Motivations and context

Recent years have seen a tremendous development of the Internet and social networks. Unfortunately, the dark side of this growth is an increase in hate speech. Only a small percentage of people use the Internet for unhealthy activities such as hate speech. However, the impact of this small percentage of users is extremely damaging.

Hate speech is the subject of different national and international legal frameworks. Manually monitoring and moderating Internet and social media content to identify and remove hate speech is extremely expensive. This internship aims at designing methods to automatically learn hate speech detection systems from Internet and social media data. Despite the studies already published on this subject, the results show that the task remains very difficult (Schmidt and Wiegand, 2017; Zhang and Luo, 2018).

In text classification, text documents are usually represented in a so-called vector space and then assigned to predefined classes through supervised machine learning. Each document is represented as a numerical vector, which is computed from the words of the document. How to represent the terms numerically in an appropriate way is a basic problem in text classification tasks and directly affects the classification accuracy. Advances in neural networks have led to a renewed interest in the field of distributional semantics, more specifically in learning word embeddings (representations of words in a continuous space). Computational efficiency was a major factor in popularizing word embeddings. Word embeddings capture syntactic as well as semantic properties of words (Mikolov et al., 2013). As a result, they have outperformed several other word vector representations on different tasks (Baroni et al., 2014).
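To make the idea of a document vector concrete, here is a minimal sketch in Python that averages the word embeddings of a document. It is only an illustration: the embedding table TOY_EMBEDDINGS and the function document_vector are hypothetical names, the two-dimensional vectors are invented for the example, and averaging is just one simple way of combining word vectors, not necessarily the one used in our system.

```python
from typing import Dict, List

# Hypothetical toy embedding table (assumption for illustration only).
# A real system would load pretrained vectors with hundreds of dimensions.
TOY_EMBEDDINGS: Dict[str, List[float]] = {
    "good": [1.0, 0.0],
    "bad": [-1.0, 0.0],
    "movie": [0.0, 1.0],
}


def document_vector(text: str, embeddings: Dict[str, List[float]], dim: int) -> List[float]:
    """Represent a document as the average of the embeddings of its known words."""
    tokens = text.lower().split()  # naive whitespace tokenization
    known = [embeddings[t] for t in tokens if t in embeddings]
    if not known:
        # No known word: fall back to the zero vector.
        return [0.0] * dim
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


print(document_vector("good movie", TOY_EMBEDDINGS, 2))  # → [0.5, 0.5]
```

The resulting vector can then be passed to any supervised classifier. Words missing from the table are simply ignored here, which is one reason subword-based embeddings are attractive for noisy social media text.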

Our methodology for hate speech detection builds on recent approaches for text classification with neural networks and word embeddings. In this context, fully connected feed-forward networks, Convolutional Neural Networks (CNN) and Recurrent/Recursive Neural Networks (RNN) have been applied. On the one hand, the approaches based on CNN and RNN capture rich compositional information and have outperformed previous state-of-the-art results in text classification; on the other hand, they are computationally intensive and require a huge corpus of training data.

To train these deep neural network (DNN) hate speech detection systems, it is necessary to have a very large corpus of training data. This training data must contain several thousand social media comments, and each comment should be labeled as hate or not hate. It is easy to collect social media and Internet comments automatically. However, labeling a huge corpus is time consuming and very costly. Of course, for several hundred comments this work can be performed manually by human annotators, but it is not feasible for a huge corpus of comments. In this case, weakly supervised learning can be used: the idea is to train a deep neural network with a limited amount of labeled data.

The goal of this master internship is to develop a methodology for weakly supervised learning of a hate speech detection system using social network data (Twitter, YouTube, etc.).


In our Multispeech team, we have developed a baseline system for automatic hate speech detection. This system is based on fastText and BERT embeddings (Bojanowski et al., 2017; Devlin et al., 2018) and on CNN/RNN architectures. During this internship, the master student will work on this system in the following directions:

  • Study of the state-of-the-art approaches in the field of weakly supervised learning;
  • Implementation of a baseline method of weakly supervised learning for our system;
  • Development of a new methodology for weakly supervised learning. Two cases will be studied. In the first case, we train the hate speech detection system using a small labeled corpus. Then, we proceed incrementally: we use this first system to label more data, retrain the system, and use it to label new data. In the second case, we consider learning with noisy labels (labels that may be incorrect, or that are given by several annotators who do not agree).
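The incremental procedure of the first case can be sketched as a self-training loop. The Python code below is only an illustrative sketch under strong assumptions: a nearest-centroid classifier stands in for the CNN/RNN system, the task is binary, and the function names (train, predict, self_train), the confidence measure, and the threshold of 0.8 are all hypothetical choices made for this example.

```python
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]
Model = Dict[int, List[float]]  # class label -> centroid


def distance(a: Vector, b: Vector) -> float:
    """Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def train(labeled: List[Tuple[Vector, int]]) -> Model:
    """Compute one centroid per class (stand-in for retraining the real system)."""
    model: Model = {}
    for label in sorted({y for _, y in labeled}):
        members = [x for x, y in labeled if y == label]
        dim = len(members[0])
        model[label] = [sum(m[i] for m in members) / len(members) for i in range(dim)]
    return model


def predict(model: Model, x: Vector) -> Tuple[int, float]:
    """Return the nearest class and a confidence in [0.5, 1] (binary case)."""
    dists = {label: distance(x, c) for label, c in model.items()}
    best = min(dists, key=dists.get)
    total = sum(dists.values())
    confidence = 1.0 - dists[best] / total if total > 0 else 0.5
    return best, confidence


def self_train(labeled, unlabeled, threshold=0.8, max_rounds=5):
    """Label confident examples with the current model, retrain, and repeat."""
    labeled, pool = list(labeled), list(unlabeled)
    model = train(labeled)
    for _ in range(max_rounds):
        confident, remaining = [], []
        for x in pool:
            label, conf = predict(model, x)
            if conf >= threshold:
                confident.append((x, label))
            else:
                remaining.append(x)
        if not confident:
            break  # nothing new to add: stop early
        labeled.extend(confident)
        pool = remaining
        model = train(labeled)
    return model, labeled, pool
```

For example, calling self_train([([0.0], 0), ([10.0], 1)], [[1.0], [9.0], [4.0], [5.0]]) adds the two points close to the seed examples to the labeled set and leaves the ambiguous points [4.0] and [5.0] unlabeled. Only confident predictions are added because wrong automatic labels would otherwise be reinforced at each round, which connects this first case to the noisy-label setting of the second case.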


Baroni, M., Dinu, G., and Kruszewski, G. “Don’t count, predict! A systematic comparison of context-counting vs. context-predicting semantic vectors”. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, Volume 1, pages 238-247, 2014.

Bojanowski, P., Grave, E., Joulin, A., and Mikolov, T. “Enriching word vectors with subword information”. Transactions of the Association for Computational Linguistics, 5:135–146, 2017.

Dai, A. M. and Le, Q. V. “Semi-supervised sequence learning”. In Cortes, C., Lawrence, N. D., Lee, D. D., Sugiyama, M., and Garnett, R., editors, Advances in Neural Information Processing Systems 28, pages 3061-3069. Curran Associates, Inc., 2015.

Devlin, J., Chang, M.-W., Lee, K., and Toutanova, K. “BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding”. arXiv:1810.04805v1, 2018.

Mikolov, T., Sutskever, I., Chen, K., Corrado, G. S., and Dean, J. “Distributed representations of words and phrases and their compositionality”. In Advances in Neural Information Processing Systems 26, pages 3111-3119. Curran Associates, Inc., 2013.

Schmidt, A. and Wiegand, M. “A Survey on Hate Speech Detection using Natural Language Processing”. Workshop on Natural Language Processing for Social Media, 2017.

Zhang, Z. and Luo, L. “Hate speech detection: a solved problem? The Challenging Case of Long Tail on Twitter”, 2018.