Category: Seminars

Articulatory synthesis in the entire audible frequency range

Speaker: Rémi Blandin from TU Dresden Date and place: February 18, 2021 at 10:30, VISIO-CONFERENCE Abstract: Speech sounds are produced by multiple complex physical phenomena, such as fluid-structure interaction and turbulent flow. Greatly simplified descriptions of these phenomena are used to simulate speech production. As an example, the vocal tract (the air volume from the …
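As a rough illustration of such a simplified description (a textbook approximation, not the speaker's model), the vocal tract can be treated as a uniform tube closed at the glottis and open at the lips, whose quarter-wavelength resonances approximate the formants of a neutral vowel:

```python
# Uniform-tube ("quarter-wavelength resonator") approximation of vocal tract
# formants; the constant and function name below are illustrative.
C = 343.0   # speed of sound in air at ~20 degrees C (m/s)

def tube_resonances(length, n=3, c=C):
    """Resonances of a tube closed at one end: f_k = (2k - 1) * c / (4 * length)."""
    return [(2 * k - 1) * c / (4 * length) for k in range(1, n + 1)]

# A ~17 cm tract gives resonances near 500, 1500, and 2500 Hz,
# close to the formants of a schwa-like neutral vowel.
formants = tube_resonances(0.17)
```

Real articulatory synthesis replaces this single tube with a chain of varying cross-sections and much richer acoustics, which is precisely where the simplifications discussed in the talk come in.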


Sound Event Localization and Detection Based on CRNN Using Rectangular Filters and Channel Rotation Data Augmentation

Speaker: Francesca Ronchini Date and place: January 28, 2021 at 10:15, VISIO-CONFERENCE Abstract: Sound Event Localization and Detection refers to the problem of identifying the presence of independent or temporally overlapped sound sources, correctly identifying the sound class to which each belongs, and estimating their spatial directions while they are active. In recent years, neural networks …
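For context, channel rotation augmentation is often applied to first-order Ambisonics recordings, where rotating the scene by an azimuth angle amounts to a linear remixing of two channels. The sketch below illustrates the idea; the channel ordering and function name are our assumptions, not necessarily the paper's exact setup:

```python
import numpy as np

def rotate_foa_azimuth(sig, phi):
    """Rotate a first-order Ambisonics scene by azimuth phi (radians).
    sig: array of shape (4, T), channels assumed ordered (W, Y, Z, X).
    W (omni) and Z (vertical) are unaffected; X and Y mix by a 2-D rotation."""
    w, y, z, x = sig
    c, s = np.cos(phi), np.sin(phi)
    return np.stack([w, s * x + c * y, z, c * x - s * y])

# A source at azimuth 0 (x = cos 0 = 1, y = sin 0 = 0) moved to azimuth 90 deg:
sig = np.array([[1.0], [0.0], [0.0], [1.0]])
rotated = rotate_foa_azimuth(sig, np.pi / 2)
```

Because the rotation only relabels directions, the same recording yields several training examples with known, shifted DOA targets.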


CRNN vs. Self-Attention in Source Localization and an Introduction of the Project HAIKUS

Speaker: Prerak Srivastava Date and place: January 21, 2021 at 10:15, VISIO-CONFERENCE Abstract: The seminar will consist of two parts: the first part covers DOA estimation work carried out during my master's internship, and the second part describes some preliminary results for the project HAIKUS. Recently, RNN-based CRNN architectures made the state …


Complex-valued and hybrid models for audio processing

Speaker: Paul Magron Date and place: January 14, 2021 at 10:30, VISIO-CONFERENCE Abstract: In this talk, I will give an overview of my work, whose main application is sound source separation: the task of automatically extracting constitutive components from their observed mixture in an audio recording. I will address it in the time-frequency domain, which …
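As a minimal sketch of time-frequency-domain separation (generic magnitude masking, not the speaker's specific models): transform the mixture with an STFT, apply a real-valued mask that scales magnitudes while reusing the mixture phase, and resynthesize:

```python
import numpy as np
from scipy.signal import stft, istft

def separate_with_mask(mixture, mask, fs=16000, nperseg=512):
    """Apply a real-valued time-frequency mask and resynthesize.
    The mask scales spectrogram magnitudes while the mixture phase is
    reused unchanged -- the classic masking-based separation setup."""
    _, _, spec = stft(mixture, fs=fs, nperseg=nperseg)
    _, estimate = istft(mask * spec, fs=fs, nperseg=nperseg)
    return estimate

# Toy check: an all-ones mask passes the mixture through unchanged.
fs = 16000
x = np.sin(2 * np.pi * 440 * np.arange(fs) / fs)
y = separate_with_mask(x, 1.0, fs=fs)
```

Reusing the mixture phase is exactly the simplification that motivates complex-valued and phase-aware models such as those discussed in the talk.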


End-to-End Spoken Language Understanding and Privacy Preserving Speech Processing

Speaker: Natalia Tomashenko Date and place: January 7, 2021 at 10:30, VISIO-CONFERENCE Abstract: This talk covers two topics: (1) end-to-end (e2e) SLU from speech and (2) privacy-preserving speech processing, together with a discussion of the challenges and promising research directions in these areas. (1) E2e SLU from speech focuses on …


Semi-supervised and Weakly Supervised Training of Speech Recognition Models

Speaker: Imran Sheikh Date and place: December 17, 2020 at 10:30, VISIO-CONFERENCE Abstract: Automatic Speech Recognition (ASR) is now available in the form of cloud services as well as deployable open-source tools. However, poor performance due to mismatch with the domain of end applications still limits their usage, especially when only a limited amount of labeled/unlabeled in-domain …
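A common ingredient of semi-supervised ASR training is confidence-filtered pseudo-labeling: decode unlabeled in-domain audio and keep only high-confidence hypotheses as training targets. The sketch below is generic; the names and the 0.9 threshold are illustrative, not the speaker's method:

```python
def select_pseudo_labels(hypotheses, threshold=0.9):
    """Keep (utterance_id, transcript) pairs whose decoder confidence is at
    least `threshold`; the rest stays unlabeled for a later training round.
    `hypotheses` is an iterable of (utterance_id, transcript, confidence)."""
    return [(uid, text) for uid, text, conf in hypotheses if conf >= threshold]

# Made-up decoder outputs for illustration.
hyps = [("utt1", "hello world", 0.95),
        ("utt2", "uh unclear", 0.40),
        ("utt3", "good morning", 0.92)]
selected = select_pseudo_labels(hyps)
```

The threshold trades off label quality against the amount of added in-domain data, which is the central tension in this family of methods.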


Implicit and explicit phase modeling in deep learning-based source separation

Speaker: Manu Pariente Date and place: December 3, 2020 at 10:30, VISIO-CONFERENCE Abstract: Speech enhancement and separation have recently seen great progress thanks to deep learning-based discriminative methods. In particular, time-domain methods relying on learned filterbanks achieve state-of-the-art performance by implicitly modeling phase and amplitude. Despite current efforts against those limitations, these methods produce very …
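A bare-bones sketch of the learned-filterbank setup (with a random basis standing in for learned filters, and non-overlapping frames for simplicity): the waveform is encoded into a real-valued representation, a mask is applied, and a synthesis basis maps back to the time domain, so phase is never represented explicitly:

```python
import numpy as np

rng = np.random.default_rng(0)
T, N, L = 1600, 64, 16    # waveform length, basis size, frame length

# Stand-ins for learned analysis/synthesis bases (random here; trained jointly
# with the masking network in the actual methods).
analysis = rng.standard_normal((N, L))
synthesis = np.linalg.pinv(analysis)          # (L, N) resynthesis basis

def encode(x):
    """Non-overlapping framing followed by projection on the analysis basis."""
    return x.reshape(-1, L) @ analysis.T      # (T // L, N) real-valued codes

def decode(codes):
    return (codes @ synthesis.T).reshape(-1)

x = rng.standard_normal(T)
mask = np.ones((T // L, N))                   # a separator network would predict this
y = decode(mask * encode(x))                  # unit mask: (near-)perfect reconstruction
```

Because the codes are real-valued, any phase information survives only implicitly in the basis coefficients, which is the modeling choice the talk examines.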


Non-native speech recognition

Speaker: Ismaël Bada Date and place: November 19, 2020 at 10:30, VISIO-CONFERENCE Abstract: We propose a method for lexicon adaptation to improve the automatic speech recognition (ASR) of non-native speakers. ASR suffers a significant drop in performance when used to recognize non-native speech, since the phonemes of …
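One simple form of lexicon adaptation (illustrative only; the confusion table and names below are hypothetical, not the proposed method) adds pronunciation variants built from phoneme substitutions that non-native speakers commonly produce:

```python
# Hypothetical confusion table: English /TH/ is often realized as /S/ or /Z/.
CONFUSIONS = {"TH": ["S", "Z"]}

def adapt_lexicon(lexicon, confusions=CONFUSIONS):
    """Add pronunciation variants by single-phoneme substitution.
    `lexicon` maps each word to a list of pronunciations (phoneme lists)."""
    adapted = {w: [list(p) for p in prons] for w, prons in lexicon.items()}
    for word, prons in lexicon.items():
        for pron in prons:
            for i, ph in enumerate(pron):
                for sub in confusions.get(ph, []):
                    variant = list(pron)
                    variant[i] = sub
                    if variant not in adapted[word]:
                        adapted[word].append(variant)
    return adapted

lex = adapt_lexicon({"think": [["TH", "IH", "NG", "K"]]})
```

The adapted lexicon lets the decoder match non-native realizations without retraining the acoustic model.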


Label Propagation-Based Semi-Supervised Learning for Hate Speech Classification

Speaker: Ashwin Geet D’Sa Date and place: November 12, 2020 at 10:30, VISIO-CONFERENCE Abstract: Research on hate speech classification has received increased attention. In real-life scenarios, only a small amount of labeled hate speech data is available to train a reliable classifier. Semi-supervised learning takes advantage of a small amount of labeled data and a large amount of unlabeled data. …
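A minimal sketch of graph-based label propagation (the generic algorithm, not necessarily the exact variant in the talk): labels spread along a row-normalized similarity graph while the known labels are clamped at each iteration:

```python
import numpy as np

def propagate_labels(W, y, n_iter=50):
    """Iterative label propagation on a similarity graph.
    W: (n, n) symmetric similarity matrix (with self-loops so rows are nonzero);
    y: (n, k) one-hot labels, with all-zero rows marking unlabeled points."""
    labeled = y.sum(axis=1) > 0
    P = W / W.sum(axis=1, keepdims=True)      # row-normalized transition matrix
    F = y.astype(float)
    for _ in range(n_iter):
        F = P @ F
        F[labeled] = y[labeled]               # clamp the known labels each step
    return F.argmax(axis=1)

# Toy graph: two clusters with one labeled point each; the unlabeled points
# inherit the label of their cluster.
W = np.array([[1, 1, 0, 0],
              [1, 1, 0, 0],
              [0, 0, 1, 1],
              [0, 0, 1, 1]], dtype=float)
y = np.array([[1, 0], [0, 0], [0, 0], [0, 1]])
labels = propagate_labels(W, y)
```

For text, the similarity matrix would typically be built from sentence-embedding distances, so a handful of labeled comments can label a much larger corpus.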


MRI of the Vocal Tract and Articulators’ Automatic Delineation

Speaker: Karyna Isaieva Date and place: November 5, 2020 at 10:30, VISIO-CONFERENCE Abstract: MRI is a very popular technology that enables fully non-invasive and non-ionizing investigation of the vocal tract. It has multiple applications, including studies of healthy speech as well as some medical applications (pathological speech studies, swallowing, etc.). We acquired a database of 10 …
