Category: Seminars

Expressive Text Driven Audio-Visual Speech Synthesis

Speaker: Sara Dahmani Date: 29 April 2019 at 14:00, C005 Abstract: In recent years, the performance of speech synthesis systems has improved thanks to deep learning-based models, but generating expressive audiovisual speech is still an open issue. Variational auto-encoders (VAEs) were recently proposed to learn latent representations of …

Read more

Automatic processing and intelligibility measurement of pathological speech

Speaker: Imed Laaridh Date: 24 April 2019 at 14:00, C005 Abstract: Speech intelligibility is at the heart of human interaction. The question of how to evaluate it is of particular interest for the quality of speech transmission through different media or transducers, the comprehensibility of speech in cases of …

Read more

VoiceTechnology and Studio Maia

Speakers: Mathieu Hu and Yassine Boudi Date: 28 March 2019 at 10:30, C103 Abstract: VoiceTechnology is a joint project between Inria innovation Lab and the recording studio Studio Maia. The goal of the project was to adapt methods of automatic speech recognition, speaker spotting and speech synthesis to the specific …

Read more

Hunting Echoes for Auditory Scene Analysis

Speaker: Diego DI CARLO Date: 21 March 2019 at 10:30, C005 Abstract: Do you remember the Marvel movie about the superhero Daredevil? He is blind, but thanks to enhanced hearing he becomes a radar-man: he can visualize sound propagation and so retrieve an image of the surrounding space. …

Read more

Generative FLOW for expressive speech synthesis

Speaker: Ajinkya Kulkarni Date: 14 February 2019 at 10:30, C005 Abstract: Recently, Generative FLOW-based architectures have been proposed for high-quality image generation. Major challenges in machine learning are the ability to learn representations from few data points and the ability to generate new data from learned …

Read more

SING: Symbol-to-Instrument Neural Generator

Speaker: Alexandre Défossez Date: 10 January 2019 at 13:00, B011 Abstract: Recent progress in deep learning for audio synthesis opens the way to models that directly produce the waveform, shifting away from the traditional paradigm of relying on vocoders or MIDI synthesizers for speech or music generation. Despite their successes, …

Read more

Deep learning-based speaker localization and speech separation from Ambisonics recordings

Speaker: Laureline Pérotin Date: 22 November 2018 at 10:30, C005 Abstract: Personal assistants are flourishing, but it is still hard to achieve voice control in adverse conditions, whenever noise, other speakers, reverberation or furniture reflections are present. Preprocessing steps such as speaker localization and speech enhancement have been shown to help automatic …

Read more

Analysis and development of speech enhancement features in cochlear implants

Speaker: Nicolas Furnon Date: 18 October 2018 at 10:30, C005 Abstract: Cochlear implants (CIs) are complex systems developed to restore the sense of hearing to people with profound hearing loss. These solutions are effective in quiet environments, but adverse situations remain very challenging for CI users. A range of algorithms are …

Read more

Alpha-stable processes for signal processing

Speaker: Mathieu Fontaine Date: 27 September 2018 at 10:30, C005 Abstract: The scientific topic of sound source separation (SSS) aims to decompose audio signals into their constituent elements, for example by separating the lead singer's voice from the musical accompaniment or from background noise. In …

Read more

Adversarial Neural Networks for Language Identification

Speaker: Raphaël Duroselle Date: 12 July 2018 at 10:30, C103 Abstract: Language identification systems are very common in speech processing and are used to classify the spoken language given a recorded audio sample. They are often used as a front-end for subsequent processing tasks such as automatic speech recognition or speaker …

Read more