Speaker: Emmanuel Vincent Date: September 22, 2016 Outline: 1. From CHiME-1 to CHiME-3 2. Environment, simulation, and microphone mismatches in CHiME-3 3. CHiME-4 tracks and baselines 4. Discussion
Jul 07
City-identification of Flickr videos using semantic acoustic features
Speaker: Benjamin Elizalde (Carnegie Mellon University) Date: July 7, 2016 Abstract: City-identification of videos aims to determine the likelihood of a video belonging to a set of cities. In this paper, we present an approach using only audio; we do not use any additional modality such as images, user-tags or geo-tags. In this manner, we show …
Jun 23
Multimodal acquisition platform
Speaker: Valerian Girard (Engineer) Date: June 23, 2016 Abstract: In this talk, I will present my work this year, in which I contributed to developing a multimodal acquisition platform that records multimodal data in a speech communication context. The platform can record motion capture data of the face, the arms and hands with and without markers using …
Jun 16
Modelling Context of OOV Words in Large Vocabulary Continuous Speech Recognition
Speaker: Imran Sheikh (PhD student) Date: June 16, 2016 Abstract: The diachronic nature of broadcast news content causes frequent variations in the linguistic content and vocabulary, leading to Out-Of-Vocabulary (OOV) words and especially OOV proper names. OOVs missed by the speech recognition system can be recovered by a dynamic vocabulary multi-pass recognition approach in which relevant proper …
Jun 09
Formant shifting for speech intelligibility improvement in car noise environment
Speaker: Karan Nathwani (post-doctoral fellow) Date: June 9, 2016 Abstract: In this work, we propose a novel approach aiming at improving the intelligibility of speech in the context of in-car applications. Speech produced in noisy environments is subject to the Lombard effect, which encompasses a number of voice transformation effects compared to the speech produced in calm …
Jun 02
A step towards multidimensional automatic improvisation
Speaker: Ken Déguernel (PhD student) Date: June 2, 2016 Abstract: Automatic music improvisation systems based on the OMax paradigm use training over a one-dimensional sequence to generate original improvisation. First, we propose a system that creates improvisation in a way closer to that of a human improviser, where the intuition of a context is enriched with knowledge. This system combines …
May 12
A combined evaluation of established and new approaches for speech recognition in varied reverberation conditions
Speaker: Sunit Sivasankaran (Engineer) Date: May 12, 2016 Abstract: Robustness to reverberation is a key concern for distant-microphone ASR. Various approaches have been proposed, including single-channel or multichannel dereverberation, robust feature extraction, alternative acoustic models, and acoustic model adaptation. We conduct a series of experiments to assess the impact of various dereverberation and acoustic model adaptation approaches on the ASR …
May 11
Optimal transport for domain adaptation
Speaker: Alain Rakotomamonjy (Université de Rouen) Date: May 11, 2016 Abstract: Domain adaptation addresses one of the most challenging tasks in machine learning: coping with mismatch between learning and testing probability distributions. If adaptation is done correctly, models learned on a specific data representation become more robust when confronted with data depicting the same problems, but described through another …
Apr 23
Compact Multiview Representation of Documents Based on the Total Variability Space
Speaker: Mohamed Bouallegue (post-doctoral fellow) Date: April 21, 2016 Abstract: In this talk, I present my research work during my thesis at Laboratoire Informatique d’Avignon and my postdoctoral research at Laboratoire d’Informatique de l’Université du Maine. This work explores the paradigm of Factor Analysis/i-vector for identification of topics in spoken documents. We identify themes from dialogues of …
Mar 03
Introduction to Sum Product Networks for noisy speech recognition
Speaker: Juan Andrés Morales Cordovilla (post-doctoral fellow) Date: March 3, 2016 Abstract: Sum Product Networks (SPNs) are a new kind of probabilistic model that has the advantages of the deep learning of Deep Neural Networks (DNNs) and of the exact marginalization of Gaussian Mixture Models (GMMs). These two properties are very useful for performing Missing Data or Uncertainty Decoding on the …