Author's posts
Feb 09
Towards robust distant speech segmentation in meetings using microphone arrays
Speaker: Théo Mariotte Date and time: Feb 9, 2023, at 10:30 Abstract: Speaker diarization answers the question "Who spoke and when?" in an audio stream. Most diarization systems consist of two major steps: segmentation and clustering. The former relates to speaker activity and detects time boundaries in the signal. The latter groups segments featuring …
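The two-step pipeline described in the abstract (segmentation, then clustering) can be illustrated with a minimal sketch. This is not the speaker's system: the activity scores, the threshold values, and the greedy cosine-similarity clustering are all hypothetical simplifications chosen for brevity.

```python
import numpy as np

def segment(activity, threshold=0.5):
    """Segmentation step (toy version): return (start, end) frame index
    pairs where a per-frame speech-activity score exceeds a threshold.
    `activity` is a hypothetical input; real systems use a neural VAD."""
    active = activity > threshold
    edges = np.diff(active.astype(int))       # rising/falling edges of the mask
    starts = np.where(edges == 1)[0] + 1
    ends = np.where(edges == -1)[0] + 1
    if active[0]:
        starts = np.r_[0, starts]
    if active[-1]:
        ends = np.r_[ends, len(active)]
    return list(zip(starts, ends))

def cluster(embeddings, sim_threshold=0.8):
    """Clustering step (toy version): greedily assign each segment
    embedding to the first cluster whose centroid is similar enough
    (cosine similarity), otherwise open a new cluster/speaker."""
    centroids, labels = [], []
    for e in embeddings:
        e = e / np.linalg.norm(e)
        sims = [float(c @ e) for c in centroids]
        if sims and max(sims) >= sim_threshold:
            labels.append(int(np.argmax(sims)))
        else:
            centroids.append(e)
            labels.append(len(centroids) - 1)
    return labels
```

Production diarization replaces both toys with learned components (e.g. neural segmentation and spectral or agglomerative clustering of speaker embeddings), but the division of labor is the same.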
Jan 19
Transfer Learning for Abusive Language Detection
Speaker: Tulika Bose Data and time: Jan 19, 2023, at 10:30 Abstract: The proliferation of social media, despite its multitude of benefits, has led to the increased spread of abusive language. Deep learning models for detecting abusive language have displayed great levels of in-corpus performance but under-perform substantially outside the training distribution. Moreover, they require …
Nov 24
BinauRec: A dataset to test the influence of the use of room impulse responses on binaural speech enhancement
Speaker: Louis Delebecque Date and time: Nov 24, 2022, at 10:30 Abstract: Thanks to spatial information contained in reverberated signals, multichannel speech enhancement (SE) algorithms are able to outperform single-channel systems. Reverberated signals are often generated from simulations of room impulse responses (RIRs). However, the influence of such methods on SE quality has not been investigated …
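The data-generation step the abstract refers to, producing a reverberated signal from a dry one via an RIR, is a plain convolution. A minimal sketch, with a hypothetical two-tap "room" standing in for a simulated or measured RIR:

```python
import numpy as np

def reverberate(dry, rir):
    """Simulate reverberation by convolving a dry (anechoic) signal
    with a room impulse response, truncated to the dry-signal length."""
    return np.convolve(dry, rir)[: len(dry)]

# Toy RIR: direct path plus a single echo at lag 2 with half amplitude.
dry = np.zeros(8)
dry[0] = 1.0
rir = np.array([1.0, 0.0, 0.5])
```

Whether the RIR comes from a shoebox simulator or a real measurement is exactly the design choice whose impact on enhancement quality the BinauRec dataset is meant to test.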
Nov 03
Training a speech emotion classifier without categorical annotations
Speaker: Meysam Shamsi Date and time: Nov 3, 2022, at 10:30 Abstract: The emotion recognition task can be treated as classification using categorical labels or as regression using a dimensional description in continuous space. An investigation of the relation between these two representations will be presented, followed by a classification pipeline that uses only dimensional annotations …
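One simple way to bridge the two representations the abstract contrasts is nearest-centroid mapping from the continuous space to labels. The sketch below is an illustration only: the (valence, arousal) centroids are invented for the example, whereas a real pipeline would estimate them from annotated data.

```python
import numpy as np

# Hypothetical category centroids in (valence, arousal) space.
CENTROIDS = {
    "happy":   ( 0.8,  0.5),
    "angry":   (-0.6,  0.7),
    "sad":     (-0.7, -0.5),
    "neutral": ( 0.0,  0.0),
}

def categorize(valence, arousal):
    """Map a dimensional (valence, arousal) prediction to the nearest
    categorical label by Euclidean distance to the centroids."""
    p = np.array([valence, arousal])
    return min(CENTROIDS, key=lambda k: np.linalg.norm(p - np.array(CENTROIDS[k])))
```

With such a mapping, a regressor trained purely on dimensional annotations can still produce categorical outputs, which is the kind of pipeline the talk investigates.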
Sep 29
Flexible parametric spatial audio processing and spatial acoustic scene analysis research at Aalto and Tampere University, Finland
Speaker: Archontis Politis Date and place: Sep 29, 2022, at 10:30 – Hybrid Abstract: Archontis Politis is a researcher on spatial audio technologies, currently at Tampere University, Finland, and in close collaboration with Aalto University, Finland. This presentation summarizes work the researcher has been involved in at these two universities, mainly around two areas. The first …
Sep 22
Sound event detection for low power embedded systems
Speaker: Marie-Anne Lacroix Date and place: September 22, 2022, at 10:30 – Hybrid Abstract: Supervised sound event detection software implementations currently achieve high performance. This allows the development of real-world applications, especially in the growing domain of the Internet of Things (IoT). However, current performance is achieved at the cost of high computational complexity and …
Aug 04
[PhD position] Visually-assisted Speech Enhancement
Context: This is a fully-funded PhD position defined within the context of the ANR project REAVISE (Robust and Efficient Deep Learning-based Audiovisual Speech Enhancement), which aims at developing a unified, robust, and generalizable audiovisual speech enhancement framework. The PhD candidate will work in the MULTISPEECH team, Inria Nancy – Grand Est, France, under the co-supervision of Mostafa Sadeghi (researcher, Inria), …
Jun 09
Time-frequency fading
Speaker: Marina Kreme Date and place: June 9, 2022, at 10:30 – Hybrid Abstract: We are interested in the problem of attenuating time-frequency regions, for example when a disturbance signal is well localized in the time-frequency plane. We approach this problem from the point of view of time-frequency filtering, by formulating the optimization problem in the signal …
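The basic operation behind time-frequency fading, attenuating selected bins of a time-frequency representation and resynthesizing the signal, can be sketched as follows. This is not the speaker's optimization-based formulation: it is a naive mask applied in non-overlapping rectangular FFT frames (which reconstruct exactly), with hypothetical function and parameter names.

```python
import numpy as np

def tf_fade(x, frame_len, fade):
    """Attenuate time-frequency bins of x. `fade` maps a
    (frame_index, bin_index) pair to a gain in [0, 1]; bins not listed
    keep gain 1. Non-overlapping rectangular frames reconstruct exactly."""
    n_frames = len(x) // frame_len
    y = x.astype(float).copy()
    for t in range(n_frames):
        frame = y[t * frame_len:(t + 1) * frame_len]
        spec = np.fft.rfft(frame)
        for (ft, fb), gain in fade.items():
            if ft == t:
                spec[fb] *= gain          # fade the selected TF bin
        y[t * frame_len:(t + 1) * frame_len] = np.fft.irfft(spec, frame_len)
    return y
```

The talk's contribution lies precisely where this naive version is weakest: with overlapping windows, hard masking distorts the signal, which motivates formulating the fading as an optimization problem in the signal domain.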
Jun 08
[Research Engineer or Post-doc] Robust and Generalizable Deep Learning-based Audio-visual Speech Enhancement
Context: The Multispeech team, at Inria Nancy, France, seeks a qualified candidate to work on signal processing and machine learning techniques for robust audiovisual speech enhancement. The candidate will be working under the co-supervision of Mostafa Sadeghi (researcher, Multispeech team), Xavier Alameda-Pineda (researcher and team leader of RobotLearn team), and Radu Horaud (senior researcher, RobotLearn team). Starting date & duration: October 2022 (flexible), for a duration of …
May 19
On the impact of normalization strategies in unsupervised adversarial domain adaptation for acoustic scene classification
Speaker: Mauricio Michel Olvera Zambrano Date and place: May 19, 2022, at 10:30 – Hybrid Abstract: Acoustic scene classification systems face performance degradation when trained and tested on data recorded by different devices. Unsupervised domain adaptation methods have been studied to reduce the impact of this mismatch. While they do not assume the availability of labels at …
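As one concrete example of the normalization strategies the talk examines, per-recording standardization of the input features can already absorb part of a device's fixed channel response. A minimal sketch (the function name and the (time, frequency) feature layout are assumptions, not the speaker's implementation):

```python
import numpy as np

def instance_norm(feats, eps=1e-8):
    """Per-recording, frequency-wise standardization of a (time, freq)
    feature matrix, e.g. log-mel energies. Removing the per-recording
    mean/scale discards a device-dependent offset, one of the
    normalization choices studied for reducing device mismatch."""
    mu = feats.mean(axis=0, keepdims=True)
    sigma = feats.std(axis=0, keepdims=True)
    return (feats - mu) / (sigma + eps)
```

Where such normalization is placed (on the input, inside the network, before or after adversarial alignment) is exactly the design axis whose impact the talk analyzes.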