Feedback on text analysis and emotion recognition in voice using deep learning

Speaker: Nicolas Turpault

Date: February 15, 2018


– During my internship at a startup in London, I developed a system that attempts to recognise emotion in voice. We extracted speech features (MFCC) and fed them to an RNN (LSTM) to predict the emotion. We trained in a fully supervised way on the SEMAINE and AVEC databases, which are annotated with emotion labels (valence, arousal) for voice. The startup's main constraint was that the system had to run in a mobile application (limited microphone quality and computing power).

– During my apprenticeship, I developed a decision-support tool for a ticketing system. The goal was to predict, from the ticket text alone, which team would be able to solve the ticket. To do so, I preprocessed the text (including lemmatization) and then compared several RNN architectures (LSTM/BLSTM). I also compared two kinds of input to the RNN: an embedding layer learned within the network and the output of a pretrained word2vec model. Finally, I showed the impact of fine-tuning on this specific task.
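
To make the two input options for the ticket classifier concrete, here is a minimal sketch in plain Python. It is not the code used in the project: the toy tickets, the three-dimensional vectors standing in for word2vec, and all function names are hypothetical and chosen only for illustration. It shows how a tokenized ticket becomes the sequence of vectors an RNN would consume, either through a randomly initialised embedding table (which the network would then learn) or through a fixed lookup in pretrained vectors.

```python
import random

# Hypothetical toy vectors standing in for a trained word2vec model;
# real vectors would come from a large corpus and be much wider.
PRETRAINED = {
    "printer": [0.9, 0.1, 0.0],
    "network": [0.1, 0.8, 0.2],
    "error": [0.3, 0.3, 0.9],
}
DIM = 3
UNK = "<unk>"


def build_vocab(tokenized_tickets):
    """Map each token to an integer index; index 0 is reserved for unknown tokens."""
    vocab = {UNK: 0}
    for tokens in tokenized_tickets:
        for tok in tokens:
            vocab.setdefault(tok, len(vocab))
    return vocab


def init_embedding_table(vocab, dim, seed=0):
    """Option A: one small random vector per vocabulary entry.

    In a real network these values would be updated during training.
    """
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(len(vocab))]


def embed_with_table(tokens, vocab, table):
    """Look up each token's row in the table; unknown tokens use row 0."""
    return [table[vocab.get(tok, 0)] for tok in tokens]


def embed_with_pretrained(tokens, pretrained, dim):
    """Option B: fixed lookup in pretrained vectors; unknown tokens map to zeros."""
    zero = [0.0] * dim
    return [pretrained.get(tok, zero) for tok in tokens]


# Two toy tickets, already tokenized and lemmatized (assumed preprocessing).
tickets = [["printer", "error"], ["network", "error", "vpn"]]
vocab = build_vocab(tickets)
table = init_embedding_table(vocab, DIM)

seq_a = embed_with_table(tickets[1], vocab, table)         # learned-embedding input
seq_b = embed_with_pretrained(tickets[1], PRETRAINED, DIM)  # word2vec-style input
```

Both sequences have one vector per token, so the same RNN can consume either. In this picture, fine-tuning would roughly correspond to starting from the pretrained vectors of option B and letting training update them, rather than keeping them fixed.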

During the presentation, I will briefly present the problems encountered and the solutions proposed, so that we can discuss the interesting points together.