Expression-preserving face frontalization improves visually assisted speech processing

by Zhiqi Kang, Mostafa Sadeghi, Radu Horaud and Xavier Alameda-Pineda
International Journal of Computer Vision, 2023, 131 (5), pp. 1122-1140
[arXiv] [HAL] [webpage]
Abstract. Face frontalization consists of synthesizing a frontally-viewed face from an arbitrarily-viewed one. The main contribution of this paper is a frontalization methodology that preserves non-rigid facial deformations in order to boost…

Continue reading
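The rigid part of frontalization can be illustrated with a classical Kabsch/Procrustes alignment of 3D facial landmarks. This is a minimal sketch under assumed inputs (landmark arrays of matching size), not the authors' method: it deliberately ignores the non-rigid deformations that the paper's contribution is precisely to preserve.

```python
import numpy as np

def frontalize_rigid(landmarks, template):
    """Rigidly align 3D face landmarks to a frontal template.

    A minimal Kabsch/Procrustes sketch: estimates the rotation and
    translation mapping `landmarks` (N x 3, arbitrary pose) onto
    `template` (N x 3, frontal pose). Non-rigid deformations such as
    lip and eyebrow motion are NOT preserved here.
    """
    src = landmarks - landmarks.mean(axis=0)
    dst = template - template.mean(axis=0)
    # Cross-covariance matrix and its SVD give the optimal rotation.
    U, _, Vt = np.linalg.svd(src.T @ dst)
    d = np.sign(np.linalg.det(Vt.T @ U.T))  # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return src @ R.T + template.mean(axis=0)
```

Applied to landmarks of a rotated and translated face, this recovers the frontal configuration exactly in the rigid case; handling expression-induced deformation robustly is where the paper goes beyond this baseline.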

Learnable Geometric Reconstruction and UV Parameterization for Digitizing Humans in Loose Garments

Avinash Sharma, IIIT-Hyderabad, India
Friday, 7 October 2022, 15:00-16:00, room 106, Laboratoire Jean Kuntzmann, 700 avenue Centrale, Saint-Martin-d'Hères
Webex link: https://inria.webex.com/inria/j.php?MTID=m1c3e564d9357c0f8e26f899f631464db
Abstract: This talk primarily aims at providing an overview of our research contributions towards devising learnable paradigms for the 3D digitization of human bodies in loose garments. In the…

Continue reading

Seminar: LAEO-Net++: Revisiting People Looking at Each Other in Videos

Manuel J. Marin-Jimenez, University of Cordoba, Spain
Thursday, 7 July 2022, 14:00-15:00, room F107, Inria Montbonnot Saint-Martin
Attend online: https://inria.webex.com/inria/j.php?MTID=mb256349fcf231701cb7e004536b4f398
Abstract: Capturing the ‘mutual gaze’ of people is essential for understanding and interpreting the social interactions between them. To this end, this paper addresses the problem of detecting people Looking At Each…

Continue reading
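The ‘mutual gaze’ notion in the abstract above can be made concrete with a purely geometric test: two people are looking at each other when each one's gaze direction points at the other's head within some angular tolerance. The positions, gaze vectors, and threshold below are illustrative assumptions; LAEO-Net++ learns this decision from head tracks in video rather than applying a fixed geometric rule.

```python
import numpy as np

def looking_at_each_other(p1, g1, p2, g2, max_angle_deg=15.0):
    """Geometric sketch of a mutual-gaze (LAEO) test.

    p1, p2: 3D head positions; g1, g2: gaze-direction vectors.
    Returns True when each gaze points at the other head within
    `max_angle_deg`. An illustrative baseline, not LAEO-Net++.
    """
    def angle(v, w):
        c = np.dot(v, w) / (np.linalg.norm(v) * np.linalg.norm(w))
        return np.degrees(np.arccos(np.clip(c, -1.0, 1.0)))

    to_other = np.asarray(p2, float) - np.asarray(p1, float)
    return bool(angle(g1, to_other) < max_angle_deg
                and angle(g2, -to_other) < max_angle_deg)
```

A fixed threshold like this breaks down with noisy head-pose estimates, which motivates learning the decision end-to-end as the talk describes.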

Seminar: Machine Learning for Indoor Acoustics

Antoine Deleforge, Multispeech team, Inria Nancy Grand-Est
Wednesday, 15 June 2022, 15:30, room F107, Inria Montbonnot Saint-Martin
Attend online: https://inria.webex.com/inria/j.php?MTID=m30df5cc25af1cc7f052683154f4f7638
Abstract: Close your eyes, clap your hands. Can you hear the shape of the room? Is there carpet on the floor? Answering these peculiar questions may have applications in acoustic diagnosis,…

Continue reading

The impact of removing head movements on audio-visual speech enhancement

by Zhiqi Kang, Mostafa Sadeghi, Radu Horaud, Xavier Alameda-Pineda, Jacob Donley, Anurag Kumar
ICASSP’22, Singapore
[paper] [examples] [code] [slides]
Abstract. This paper investigates the impact of head movements on audio-visual speech enhancement (AVSE). Although they are a common conversational feature, head movements have been ignored by past and recent studies: they challenge today’s learning-based…

Continue reading

[Closed] Master of science internship: Dynamic face modeling for audio-visual speech processing

The analysis of human faces has been thoroughly investigated over the last few decades, leading to highly performant 2D and 3D face representations and to face recognition models and systems. Nevertheless, the analysis of face movements has been, comparatively, much less investigated. Face movements play a crucial role in human-to-human,…

Continue reading

Robust Face Frontalization For Visual Speech Recognition

by Zhiqi Kang, Radu Horaud and Mostafa Sadeghi
ICCV’21 Workshop on Traditional Computer Vision in the Age of Deep Learning (TradiCV’21)
[paper (extended version)] [code] [bibtex]
Abstract. Face frontalization consists of synthesizing a frontally-viewed face from an arbitrarily-viewed one. The main contribution is a robust method that preserves non-rigid facial deformations, i.e.…

Continue reading

Fullsubnet: a full-band and sub-band fusion model for real-time single-channel speech enhancement

By Xiang Hao*,#, Xiangdong Su#, Radu Horaud and Xiaofei Li* (*Westlake University, #Inner Mongolia University, China)
ICASSP 2021
[arXiv] [github] [youtube]
Abstract. This paper proposes a full-band and sub-band fusion model, named FullSubNet, for single-channel real-time speech enhancement. Full-band and sub-band refer to models whose inputs are full-band and sub-band noisy…

Continue reading
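The full-band/sub-band distinction in the abstract above can be sketched concretely: a full-band model sees the whole spectrogram at once, while a sub-band model processes each frequency bin together with a few neighbouring bins. The neighbourhood size and array shapes below are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np

def subband_inputs(mag, n=2):
    """Build per-frequency sub-band inputs from a magnitude spectrogram.

    `mag` has shape (F, T). For each frequency bin f, the sub-band
    input stacks the 2*n+1 neighbouring bins [f-n, ..., f+n], with
    zero-padding at the spectrum edges, giving shape (F, 2*n+1, T).
    A full-band model consumes all of `mag` directly; FullSubNet
    fuses the two views.
    """
    F, _ = mag.shape
    padded = np.pad(mag, ((n, n), (0, 0)))  # zero-pad frequency axis
    return np.stack([padded[f:f + 2 * n + 1] for f in range(F)])
```

The sub-band view exposes local spectral patterns (e.g. how noise behaves around one bin) that complement the global full-band view; fusing both is the design choice the paper's name refers to.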