Automatic analysis of user’s social cues during mediated communication

Speaker: Nathan Libermann Date: March 16, 2017 Abstract: In this talk I will present the results of my master's internship. I propose to explore the social cues expressed by a user during mediated communication, either with an embodied conversational agent or with another human. For this purpose, I have exploited a machine learning method to …

Supervised group nonnegative matrix factorisation with similarity constraints and applications to speaker identification

Speaker: Romain Serizel Date: February 9, 2017 Abstract: This paper presents supervised feature learning approaches for speaker identification that rely on nonnegative matrix factorisation. Recent studies have shown that group nonnegative matrix factorisation and task-driven supervised dictionary learning can help perform effective feature learning for audio classification problems. This paper proposes to integrate a recent method …

Environmental low-rank multiway data mining

Speaker: Jérémy E. Cohen (Université de Mons) Date: January 16, 2017  

Time-frequency masking and Optimal Wiener filter for multichannel speech enhancement

Speaker: Ziteng Wang Date: December 8, 2016 Abstract: Time-frequency speech presence probability estimation, or mask estimation, is crucial in speech enhancement. This is especially the case for the Multichannel Wiener Filter (MWF), whose solution relies only on the second-order statistics of speech and noise. For the estimation methods, there has been a shift from experimental thresholding on multichannel features to …

Speaker Recognition: Current Challenges and Trends

Speaker: Dayana Ribas Date: November 3, 2016 Abstract: Currently there is increasing interest in the development of technologies that integrate biometric systems, due to their wide use in applications where the identification of individuals is required. In this context, Automatic Speaker Recognition Systems are in high demand for both commercial and security applications, and therefore …

Feature learning based on nonnegative matrix factorisation for speaker identification

Speaker: Romain Serizel Date: September 29, 2016 Abstract: The main goal of speaker identification is to determine whether or not the speaker in an audio recording is known and, if so, to find his/her identity. A recent trend is to use feature-learning-based approaches to overcome the limitations of hand-crafted features. This talk …

A summary of the CHiME-4 Speech Separation and Recognition Challenge

Speaker: Emmanuel Vincent Date: September 22, 2016 Outline: 1. From CHiME-1 to CHiME-3 2. Environment, simulation, and microphone mismatches in CHiME-3 3. CHiME-4 tracks and baselines 4. Discussion

City-identification of Flickr videos using semantic acoustic features

Speaker: Benjamin Elizalde (Carnegie Mellon University) Date: July 7, 2016 Abstract: City-identification of videos aims to determine the likelihood of a video belonging to a set of cities. In this paper, we present an approach using only audio; that is, we do not use any additional modality such as images, user-tags or geo-tags. In this manner, we show …

Multimodal acquisition platform

Speaker: Valerian Girard (Engineer) Date: June 23, 2016 Abstract: In this talk, I will present my work over this year, during which I contributed to developing a multimodal acquisition platform that records multimodal data in a speech communication context. The platform can record motion capture data of the face, the arms and hands with and without markers using …

Modelling Context of OOV Words in Large Vocabulary Continuous Speech Recognition

Speaker: Imran Sheikh (PhD student) Date: June 16, 2016 Abstract: The diachronic nature of broadcast news content causes frequent variations in the linguistic content and vocabulary, leading to Out-Of-Vocabulary (OOV) words and especially OOV proper names. OOVs missed by the speech recognition system can be recovered by a dynamic vocabulary multi-pass recognition approach in which relevant proper …

