Category: Seminars

Jun 06

Upcoming team seminars


Jun 06

Semi-supervised learning with deep neural networks for relative transfer function inverse regression

Speaker: Emmanuel Vincent Date: June 07, 2018 Abstract: Prior knowledge of the relative transfer function (RTF) is useful in many applications but remains little studied. In this work, we propose a semi-supervised learning algorithm based on deep neural networks (DNNs) for RTF inverse regression, that is, to generate the full-band RTF vector directly from the source-receiver …


May 29

Leveraging Word Contexts in Wikipedia for OOV Proper Nouns Recovery in Speech Recognition

Speaker: Badr Abdullah Date: May 31, 2018 Abstract: Automatic Speech Recognition (ASR) systems are usually trained on static data and a finite vocabulary. When a spoken utterance contains Out-Of-Vocabulary (OOV) words, ASR systems misrecognize these words as in-vocabulary words with similar acoustic properties but entirely different meanings. The majority of OOV words are information-rich proper …


May 23

Speech/non-speech segmentation for speech recognition

Speakers: Odile Mella and Dominique Fohr Date: May 24, 2018

Apr 09

Multiple-input neural network-based residual echo suppression

Speaker: Guillaume Carbajal Date: April 12, 2018 Abstract: A residual echo suppressor (RES) aims to suppress the residual echo in the output of an acoustic echo canceler (AEC). Spectral-based RES approaches typically estimate the magnitude spectra of the near-end speech and the residual echo from a single input, that is, either the far-end speech or the …


Mar 29

Multichannel speech separation with RNN from high-order ambisonics recordings

Speaker: Lauréline Pérotin Date: March 29, 2018 Abstract: We present a source separation system for high-order ambisonics (HOA) content. We derive a multichannel spatial filter from a mask estimated by a long short-term memory (LSTM) recurrent neural network. We combine one channel of the mixture with the outputs of basic HOA beamformers as inputs to the …


Mar 14

VisArtico: multimodal visualization software – Present & future

Speakers: Slim Ouni and Sara Dahmani Date and place: March 19, 2018 – C005 Abstract: VisArtico is multimodal visualization software (acoustic, articulatory, visual, gestural) developed within the team. It has undergone several changes over the years. In this seminar, we present the software: user interface, functionalities, capabilities, etc. As this software …


Feb 02

Feedback on text analysis and emotion recognition in voice using deep learning

Speaker: Nicolas Turpault Date: February 15, 2018 Abstract: During my internship at a startup in London, I developed a system to recognise emotion in voice. In this work, we extracted speech features (MFCC) and then applied an RNN (LSTM) to predict the emotion in voice. We used the SEMAINE and AVEC databases to …


Jan 15

Biomechanical models of speech articulators to understand speech motor control

Speaker: Pascal Perrier (Gipsa-lab Grenoble) Date: January 18, 2018 Abstract: We have been working for the last 20 years on the development of 2D and 3D biomechanical models of speech articulators, with the aim of better understanding (1) how speech movements are constrained, (2) which degrees of freedom speakers have to deal with the goals …


Nov 30

Arabic speech synthesis

Speaker: Amal Houidhek Date: November 30, 2017 Abstract: The first part of the presentation investigates statistical parametric speech synthesis (SPSS) of Modern Standard Arabic (MSA). A hidden Markov model (HMM)-based speech synthesis system relies on a description of speech segments corresponding to phonemes, with a large set of features that represent phonetic, phonological, linguistic and contextual aspects. …
