Anastasiia TSUKANOVA

Author's posts

Lead2Gold: Towards exploiting the full potential of noisy transcriptions for speech recognition

Speaker: Adrien Dufraux. Paper by Adrien Dufraux, Emmanuel Vincent, Awni Hannun, Armelle Brun, Matthijs Douze, submitted to ASRU 2019. Date: 5 September 2019, 10:30 – C005. tl;dr: Learn an ASR model from noisy transcriptions. At training time, we search for better transcriptions by incorporating a noise model into a differentiable beam search algorithm. Abstract: The …

Continue reading

Introduction of semantic information in speech recognition

Speaker: Stéphane Level. Date: 12 September 2019, 10:30 – C005. Abstract: Automatic speech recognition (ASR) is a growing field, with increasing industrial demand for recognition systems and voice commands. Industrial use of this technology requires reliable, high-performing methods. Current automatic speech …

Continue reading

Upcoming seminars

To view the schedule, please click "Continue reading".

Continue reading

Natural Language Processing: Online hate speech against migrants

Speaker: Ashwin Geet D’Sa. Date: 29 August 2019, 10:30 – C005. Abstract: The spectacular expansion of the Internet has led to the development of a new area of natural language processing: automatic hate speech detection, since hate speech is prohibited in many countries. There is no clear and formal …

Continue reading

Semi-supervised triplet loss based learning of ambient audio embeddings

Speaker: Nicolas Turpault. Date: 2 May 2019, 10:30 – C005. Abstract: Deep neural networks are particularly useful for learning relevant representations from data. Recent studies have demonstrated the potential of unsupervised representation learning for ambient sound analysis using various flavors of the triplet loss. They have compared this approach to …

Continue reading

Expressive Text Driven Audio-Visual Speech Synthesis

Speaker: Sara Dahmani. Date: 29 April 2019, 14:00 – C005. Abstract: In recent years, the performance of speech synthesis systems has improved thanks to deep learning-based models, but generating expressive audiovisual speech is still an open issue. Variational auto-encoders (VAEs) have recently been proposed to learn latent representations of …

Continue reading

Automatic processing and intelligibility measurement of pathological speech

Speaker: Imed Laaridh. Date: 24 April 2019, 14:00 – C005. Abstract: Speech intelligibility is at the heart of human interaction. The problem of evaluating it is of particular relevance to the quality of speech transmission through different media or transducers, the comprehensibility of speech in cases of …

Continue reading

VoiceTechnology and Studio Maia

Speakers: Mathieu Hu and Yassine Boudi. Date: 28 March 2019, 10:30 – C103. Abstract: VoiceTechnology is a joint project between Inria innovation Lab and the recording studio Studio Maia. The goal of the project was to adapt methods of automatic speech recognition, speaker spotting and speech synthesis to the specific …

Continue reading

Hunting Echoes for Auditory Scene Analysis

Speaker: Diego DI CARLO. Date: 21 March 2019, 10:30 – C005. Abstract: Do you remember the Marvel movie about the superhero Daredevil? He is blind, but thanks to his enhanced hearing he becomes a radar-man: he can visualize sound propagation and so retrieve an image of the surrounding space. …

Continue reading

Generative FLOW for expressive speech synthesis

Speaker: Ajinkya Kulkarni. Date: 14 February 2019, 10:30 – C005. Abstract: Recently, Generative FLOW based architectures have been proposed for high-quality image generation. Major challenges in the machine learning domain are the ability to learn representations from few data points and the ability to generate new data from learned …

Continue reading