How can a voice be anonymised without altering the message or making it sound robotic? This was the question put to the participants in the first edition of the VoicePrivacy Challenge. Completed in November 2020, this international competition on voice data protection was an initiative of Multispeech, a joint Loria and Inria project-team that is used to launching research challenges in the field of speech technology.
The objective: to collect a large volume of speech data without revealing the identities of the speakers. Voice anonymisation is at the heart of the VoicePrivacy Challenge and is a concern for the Multispeech team, which has made speech, a vast research domain, its field of excellence. Whether the team aims to enhance a speech signal in the presence of background noise, to identify a language or to translate signs and facial language, its work helps us better understand how we speak and how we perceive the words spoken by others. Multispeech’s research also helps machines to interact better with the people who talk to them. Denis Jouvet leads this team of more than 40 researchers and engineers: “Speech recognition and synthesis systems are based on deep learning (a form of machine learning), which requires large quantities of speech data for optimal quality.” Secure systems that process the data directly on the user’s device already exist to preserve the anonymity of the speakers, but so that this data can also be used for machine learning, Multispeech is developing voice anonymisation tools. The team has been able to compare these tools with those proposed by the teams participating in the VoicePrivacy Challenge, a new meeting place for speech specialists.
“Challenges are driving our scientific community and their complexity is increasing every year”
Numerous competitions punctuate the life of this community. Multispeech organised the SPCup 2019 challenge, where participants had to develop a sound source localisation system using a drone for search and rescue applications. The team also co-organises four regular international challenges: DCASE, for which Inria proposes the task of detecting and identifying sound events in a domestic environment; the CHiME challenge, which involves a difficult speech separation and recognition task (identifying who is speaking and transcribing the words spoken in a spontaneous conversation between friends in a noisy environment); the ASVspoof challenge, which aims to detect spoofed audio signals in order to secure voice-based biometric authentication systems; and, since 2020, the VoicePrivacy Challenge. “These different challenges are driving our scientific community,” says Denis Jouvet. “The organisers provide basic software that each team can improve and modify. So, with each new edition, the complexity of the challenge increases.” It is a healthy competition where universities and industry meet, and one that gives Inria international visibility. The important thing is to participate, and above all to analyse the results in order to advance the state of the art in science and technology.
Written by: Kogito