

European Projects

H2020 COMPRISE: Cost-effective, Multilingual, Privacy-driven voice-enabled Services

Duration: December 2018 to November 2021
Coordinator: Inria – Other partners: Universität des Saarlandes (DE), Netfective Technology (FR), Ascora (DE), Tilde (LV), Rooter Analysis (ES)
COMPRISE will define a fully private-by-design methodology and tools that will reduce the cost and increase the inclusiveness of voice interaction technologies.

ANR-DFG IFCASL: Individualized feedback in computer-assisted spoken language learning

Duration: March 2013 – February 2016
Coordinator: Jürgen Trouvain (Saarland University) – Other partners: Saarland University (COLI department)
The main objective of IFCASL is to investigate the learning of oral French by German speakers, and of oral German by French speakers, at the phonetic level. The work has mainly focused on the design of a corpus of French sentences read by German speakers learning French, a corpus of German sentences read by French speakers, and tools for annotating French and German corpora.

EUREKA Eurostars i3DMusic: Real-time Interactive 3D Rendering of Musical Recordings

Duration: October 2010 to March 2014
Coordinator: Audionamix (FR) – Other partners: Inria, EPFL (CH), Sonic Emotion (CH)
The i3DMusic project aims to enable real-time interactive respatialization of mono or stereo music content. This is achieved through the combination of source separation and 3D audio rendering techniques. Inria is responsible for the source separation work package, more precisely for designing scalable online source separation algorithms and estimating advanced spatial parameters from the available mixture.

National Projects

ANR REFINED: Real-Time Artificial Intelligence for Hearing Aids

Duration: Mar 2022 – Mar 2026
Coordinator: Inria – Other partners: CEA List (Saclay), Institut de l’audition (Paris)
The REFINED project brings together audiologists, computer scientists, and hardware implementation specialists to design new speech enhancement algorithms that meet both the needs of patients suffering from hearing loss and the computational constraints of hearing aid devices.

PEPR Cybersécurité, iPOP project: Personal Data Protection

Duration: Oct 2022 – Sep 2028
Coordinator: Inria – Other partners: Inria PRIVATICS (Lyon), COMETE, PETRUS (Saclay), MAGNET, SPIRALS (Lille), IRISA (Rennes), LIFO (Bourges), DCS (Nantes), CESICE (Grenoble), EDHEC (Lille), CNIL (Paris)
The objectives of iPOP are to study the threats on privacy introduced by new digital technologies, and to design privacy-preserving solutions compatible with French and European regulations. Within this scope, Multispeech focuses on speech data.

ANR Full3DTalkingHead: Phonetic Articulatory Synthesis

Duration: Apr 2021 – Sep 2024
Coordinator: Inria – Other partners: Gipsa-Lab (Grenoble), IADI (Nancy), LPP (Paris)
The objective is to build a complete three-dimensional digital talking head, including the vocal tract from the vocal folds to the lips as well as the face, and integrating a digital simulation of the aero-acoustic phenomena.

ANR JCJC DENISE: Tackling hard problems in audio using Data-Efficient Non-linear InverSe mEthods

Duration: October 2020 – September 2024
Coordinator: Inria – Other partners: UMR AE, Institut de Recherche Mathématique Avancée (Strasbourg), Institut de Mathématiques de Bordeaux
DENISE aims to explore the applicability of recent breakthroughs in the field of nonlinear inverse problems to audio signal reparation and to room acoustics, and to combine them with compact machine learning models to yield data-efficient techniques.

Inria Exploratory Action Acoust.IA: Artificial Intelligence for Building Acoustics

Duration: Oct 2020 – Sep 2023
Coordinator: Inria – Other partners: UMR AE
This project aims at radically simplifying and improving the acoustic diagnosis of rooms and buildings using new techniques combining machine learning, signal processing and physics-based modeling.

ANR DEEP-PRIVACY: Distributed, Personalized, Privacy-Preserving Learning for Speech Processing

Duration: Jan 2019 – June 2023
Coordinator: Inria – Other partners: Inria Lille (team MAGNET), LIA (Avignon), LIUM (Le Mans)
The objective of the DEEP-PRIVACY project is to develop a speech transformation that hides the speaker’s identity, making it easier to share speech data for training speech recognition models, and to investigate speaker adaptation and distributed training.

ANR BENEPHIDIRE: Stuttering: Neurology, Phonetics, Computer Science for Diagnosis and Rehabilitation

Duration: March 2019 – December 2023
Coordinator: Praxiling (Montpellier) – Other partners: LORIA (Nancy), INM (Montpellier), LiLPa (Strasbourg).
The BENEPHIDIRE project brings together neurologists, speech-language pathologists, phoneticians, and computer scientists specializing in speech processing to investigate stuttering as a speech impairment and to develop techniques for diagnosis and rehabilitation.

ANR HAIKUS: Artificial Intelligence applied to augmented acoustic Scenes

Duration: December 2019 – November 2023
Coordinator: Ircam (Paris) – Other partners: Inria (Nancy), IJLRA (Paris)
HAIKUS aims to achieve seamless integration of computer-generated immersive audio content into augmented reality (AR) systems. One of the main challenges is the rendering of virtual auditory objects in the presence of source movements, listener movements and/or changing acoustic conditions.

ANR LEAUDS: Learning to understand audio scenes

Duration: April 2019 – March 2023
Coordinator: Université de Rouen Normandie/LITIS – Other partners: Inria, Netatmo
LEAUDS aims to make a leap towards machines that fully understand ambient audio input by achieving breakthroughs in three intertwined directions: detection of thousands of audio events from little annotated data, robustness to “out-of-the-lab” conditions, and language-based description of audio scenes.

ANR ROBOVOX: Robust Vocal Identification for Mobile Security Robots

Duration: February 2019 – January 2023
Coordinator: Université d’Avignon/LIA – Other partners: Inria, A.I. Mergence
ROBOVOX aims at providing a voice identification solution for a mobile security robot. This requires robustness to real-world conditions such as ambient noise, reverberation, and moving sources and microphones.

Inria Project Lab HyAIAI: Hybrid Approaches for Interpretable AI

Duration: September 2019 – August 2023
Coordinator: Inria team Lacodam – Other partners: Inria teams Magnet, Multispeech, Orpailleur, SequeL, and TAU
Recent progress in machine learning (ML), and especially deep learning, has made ML pervasive in a wide range of applications. However, current approaches rely on complex numerical models: the decisions they propose, however accurate, cannot easily be explained to the laypeople who may depend on them (e.g., being granted a loan or not). In the HyAIAI IPL, we tackle the problem of making ML interpretable through the study and design of hybrid approaches that combine state-of-the-art numerical models with explainable symbolic models. More precisely, our goal is to integrate high-level (domain) constraints into ML models, to give model designers information on ill-performing parts of the model, and to give practitioners understandable explanations of the model’s results.

InriaHub ADT PEGASUS: rehaussement de la ParolE Généralisé par Apprentissage SUperviSé (generalized speech enhancement through supervised learning)

Duration: Nov 2020 – Oct 2022
Coordinator: Inria – Other partners: UMR AE, Institut de Recherche Mathématique Avancée (Strasbourg), Institut de Mathématiques de Bordeaux
This engineering project aims at further developing, expanding, and transferring the Asteroid speech enhancement and separation toolkit recently released by the team.

ANR JCJC DiSCogs: Distant speech communication with heterogeneous unconstrained microphone arrays

Duration: September 2018 – August 2022
Coordinator: Université de Lorraine/Loria
DiSCogs aims at providing new hands-free and flexible communication solutions, exploiting the many devices equipped with microphones that populate our everyday life. In particular, we propose to recast the problem of synchronizing devices at the signal level as a multi-view learning problem aiming at extracting complementary information from the devices at hand.

ANR DFG M-PHASIS: Migration and Patterns of Hate Speech in Social Media – A Cross-cultural Perspective

Duration: March 2018 – August 2022
Coordinator: Inria – Other partners: CREM Université de Lorraine, JGUM Johannes Gutenberg-Universität Mainz, SAAR Saarland University Saarbrücken
Focusing on the social dimension of hate speech, the project M-PHASIS seeks to study the patterns of hate speech related to migrants in user-generated content.

ANR VOCADOM: Robust voice command adapted to the user and to the context for ambient assisted living

Duration: January 2017 – December 2020
Coordinator: CNRS/LIG – Other partners: Inria, Université Lyon 2/GREPS, THEORIS
The goal of this project is to design a robust voice control system for smart home applications. Multispeech is responsible for wake-up word detection, overlapping speech separation, and speaker recognition.

FUI voiceHome: Voice control for smart home and multimedia appliances

Duration: February 2015 – January 2018
Coordinator: onMobile SA – Other partners: Delta Dore SA, Technicolor Connected Home Rennes SNC, Orange SA, eSoftThings SAS, Inria, IRISA, LOUSTIC
The goal of the project is to conceive and implement a natural language voice interface for smart home and multimedia (set-top-box) appliances. Inria is responsible for the robust recognition of spoken commands (cf. demo).

ANR DYCI2: Creative dynamics of improvised interaction

Duration: October 2014 – September 2018
Coordinator: Gérard Assayag (Ircam) – Other partners: Inria, University of La Rochelle
The project involves the creation, adaptation, and implementation of effective and efficient models of artificial listening, machine learning, interaction, and online creation of musical content, to enable digital musical agents. These autonomous and creative agents will be able to integrate, in an artistically credible way, into diverse human settings such as live collective performance, (post-)production, and pedagogy.

ANR ContNomina: Exploitation of context for proper names recognition in diachronic audio documents

Duration: February 2013 – July 2016
Coordinator: Irina Illina (LORIA) – Other partners: LIA, Synalp
The ContNomina project focuses on the problem of proper names in automatic audio processing systems by exploiting the context of the processed documents in the most efficient way.

ANR ORFEO: Outils et Ressources pour le Français Ecrit et Oral (Tools and Resources for Written and Spoken French)

Duration: February 2013 – February 2016
Coordinator: Jeanne-Marie Debaisieux (Université Paris 3) – Other partners: ATILF, CLLE-ERSS, ICAR, LIF, LORIA, LATTICE, MoDyCo
The main objective of the ORFEO project is the constitution of a corpus for the study of contemporary French.

Equipex ORTOLANG: Open Resources and Tools for Language

Duration: September 2012 – May 2016
Coordinator: Jean-Marie Pierrel (ATILF) – Other partners: LPL, LORIA, Modyco, LLL, INIST

FUI RAPSODIE: Automatic Speech Recognition for Hard of Hearing or Handicapped People

Duration: March 2012 – February 2016
Coordinator: eRocca SAS – Other partners: CEA Grenoble, Inria, Castorama SA
The goal of the project is to build a portable device that will help hard-of-hearing people communicate with others.

Bilateral Contracts with Industry

Contract with Meta AI

Duration: May 2022 – Apr 2025
This CIFRE grant funds a PhD on self-supervised disentangled representation learning of audio data for compression and generation.

Contract with Meta AI

Duration: Nov 2018 – Feb 2022
This CIFRE grant funded a PhD on cost-effective weakly supervised learning for automatic speech recognition.

Contract with Vivoka

Duration: Oct 2021 – Oct 2024
This contract funds a PhD on joint and embedded automatic speech separation, diarization and recognition for the generation of meeting minutes.

Contract with Dolby

Duration: September – December 2018
This contract aimed to evaluate the feasibility of state-of-the-art source separation technology for several use cases, and to identify those which could be commercially exploited.

Contract with Honda Research Institute Japan

Duration: February 2018 – March 2019
This contract targeted collaborative research on multichannel speech and audio processing, and eventual software licensing, to enable voice-based communication in challenging noisy and reverberant conditions in which current hands-free voice interfaces perform poorly.

Inria Innovation Lab Voice Technologie (with Studio MAIA)

Duration: July 2017 – March 2019
Supported by Bpifrance
This Inria Innovation Lab aims to develop a software suite for voice processing in the multimedia creation chain. The software targets sound engineers and relies on the team’s expertise in speech enhancement, robust speech and speaker recognition, and speech synthesis.

Contract with Samsung

Duration: January – November 2017
This project aimed to transfer our speech enhancement software for hands-free voice command applications.

Contract with Studio MAIA

Duration: September 2014 – August 2015
Supported by Bpifrance
A pre-study contract was signed to investigate speech processing tools that could eventually be transferred as plugins for audio mixing software. Prosody modification, noise reduction, and voice conversion are of special interest.

Contract with Venatech SAS

Duration: June 2014 – August 2017
Supported by Bpifrance
The project aims to design a real-time control system for wind farms that will maximize energy production while limiting sound nuisance. This will leverage our know-how on audio source separation and uncertainty modeling and propagation.