Category: Internship proposals

Repairing audio signals using compact phase-aware models

Dates: 5 to 6 months, starting date in March or April 2021 (flexible).
Location: Inria Nancy – Grand Est, team MULTISPEECH.
Supervisors: Antoine Deleforge (antoine (dot) deleforge (at) inria (dot) fr), Inria Researcher.
Please apply by sending your CV and a short motivation letter directly to Antoine Deleforge.
Motivation and context: The field of audio signal …

Understanding the temporal scales of language recognition: the impact of duration over utterance-level prediction

Supervisors: Raphaël Duroselle, PhD student (raphael.duroselle@loria.fr); Denis Jouvet, DR Inria (denis.jouvet@inria.fr); Irina Illina, MdC, HDR (irina.illina@loria.fr).
Motivation and context: The spoken language recognition task consists of predicting a label from recordings of variable duration. The structure of the problem is very different from that of automatic speech recognition, where an output sequence of variable length is predicted. …

Hearing the Shape of a Room: Towards Acoustic Super-Resolution

Dates: 5 to 6 months, starting date in March or April 2021 (flexible).
Location: The internship time will be shared between the environmental acoustic laboratory UMRAE (www.umrae.fr) of the Cerema of Strasbourg and the Inria team TONUS at the Institut de Recherche en Mathématiques Avancées (IRMA, https://irma.math.unistra.fr/) of Strasbourg University.
Supervisors: Antoine Deleforge (antoine (dot) deleforge …

Multi-task learning for hate speech classification

Supervisors: Irina Illina, MdC, HDR (irina.illina@loria.fr); Ashwin Geet D’Sa, PhD student (ashwin-geet.dsa@loria.fr); Dominique Fohr, CR CNRS (dominique.fohr@loria.fr).
Motivation and context: In recent years, online communication through social media has skyrocketed. Although most people use social media for constructive purposes, a few misuse these platforms to spread hate speech. Hate speech is anti-social communicative behavior …

Semantic information from the past in a speech recognition system: does the past help the present?

Supervisors: Irina Illina, MdC, HDR (illina@loria.fr); Dominique Fohr, CR CNRS (dominique.fohr@loria.fr).
Motivation and context: Semantic and thematic spaces are vector spaces used for the representation of words, sentences or textual documents. The corresponding models and methods have a long history in the field of computational linguistics and natural language processing. Almost all models rely on …

Disentanglement of Latent Codes in Dynamical Variational Autoencoders

Context: Deep latent variable models (DLVMs) provide an effective way to model the underlying hidden generative process of natural signals and images [1]. This allows us to approximate the probability density function of the data, which in turn can be used either to generate new examples resembling the training data or to perform probabilistic inference and estimation. Variational …

Switching Variational Autoencoders for Audio-visual Speech Separation

Context: In recent years, variational autoencoders (VAEs) have proven effective for generative modeling of complex signals, e.g. speech and audio [1]. Recently, they have been successfully applied to audio-visual speech separation (AVSS) [2], where the goal is to separate a target speech signal from a mixture of several speech signals, utilizing the visual information of …
