Multimodal speech animation

Speaker: Louis Abel

Date and place: May 12, 2022, at 10:30 – Hybrid

Abstract: Multimodal speech animation is the next step beyond speech synthesis: combining visuals with audio enables the creation of an embodied conversational agent (ECA), which can convey more information than a classic text-to-speech approach. Several works have been carried out in the team to advance the task of creating a natural-looking and realistic ECA. This implies two things: a high degree of precision in the animation of the lips, the jaw, the eyes, and so on, but also a realistic avatar. In this talk, I will present one work on talking heads that relies on the segmentation given by tools like Kaldi, and show how we were able to remove such constraints by using only features extracted from the audio itself. I will also present the evolution of the animation system, which now allows us to create a realistic ECA. These two parts are key points of my Ph.D. thesis on audiovisual speech generation in an interaction context.