Xavier ALAMEDA-PINEDA – Page 2

A Multimodal Dynamical Variational Autoencoder for Audiovisual Speech Representation Learning

Xavier ALAMEDA-PINEDA 2023/12/13 2024/03/11 Research, Software

by Samir Sadok, Simon Leglaive, Laurent Girin, Xavier Alameda-Pineda, and Renaud Séguier. Neural Networks. [paper][demo][code]

Abstract: In this paper, we present a multimodal and dynamical VAE (MDVAE) applied to unsupervised audio-visual speech representation learning. The latent space is structured to dissociate the latent dynamical factors that are shared between the…

Unsupervised Performance Analysis of 3D Face Alignment with a Statistically Robust Confidence Test

Xavier ALAMEDA-PINEDA 2023/12/08 2024/03/11 Research, Software, Vision

by Mostafa Sadeghi, Xavier Alameda-Pineda, and Radu Horaud. Neurocomputing, volume 564, January 2024. [Code & Data]

Abstract: We address the problem of analyzing the performance of 3D face alignment (3DFA), or facial landmark localization. Performance analysis is usually based on annotated datasets. Nevertheless, in the particular case of 3DFA, the…

Motion-DVAE: Unsupervised learning for fast human motion denoising

Xavier ALAMEDA-PINEDA 2023/11/03 2024/03/11 Research, Software, Vision

by Guénolé Fiche, Simon Leglaive, Xavier Alameda-Pineda, and Renaud Séguier. ACM SIGGRAPH Conference on Motion, Interaction and Games. [paper][code]

Abstract: Pose and motion priors are crucial for recovering realistic and accurate human motion from noisy observations. Substantial progress has been made on pose and shape estimation from images, and recent…

On the Effectiveness of LayerNorm Tuning for Continual Learning in Vision Transformers

Xavier ALAMEDA-PINEDA 2023/10/03 2024/03/11 Research, Software, Vision

by Thomas De Min, Massimiliano Mancini, Karteek Alahari, Xavier Alameda-Pineda, and Elisa Ricci. ICCV 2023 Workshops. [paper][code]

Abstract: State-of-the-art rehearsal-free continual learning methods exploit the peculiarities of Vision Transformers to learn task-specific prompts, drastically reducing catastrophic forgetting. However, there is a tradeoff between the number of learned parameters and the…

A Comprehensive Multi-scale Approach for Speech and Dynamics Synchrony in Talking Head Generation

Xavier ALAMEDA-PINEDA 2023/08/29 2024/03/11 Research, Software, Sound, Vision

by Louis Airale, Dominique Vaufreydaz, and Xavier Alameda-Pineda. [paper][code]

Abstract: Animating still face images with deep generative models using a speech input signal is an active research topic and has seen important recent progress. However, much of the effort has been put into lip syncing and rendering quality while the…

Unsupervised speech enhancement with deep dynamical generative speech and noise models

Xavier ALAMEDA-PINEDA 2023/08/13 2024/03/11 Research, Software, Sound

by Xiaoyu Lin, Simon Leglaive, Laurent Girin, and Xavier Alameda-Pineda. Interspeech 2023. [paper][code]

Abstract: This work builds on previous work on unsupervised speech enhancement using a dynamical variational autoencoder (DVAE) as the clean speech model and non-negative matrix factorization (NMF) as the noise model. We propose to replace the NMF…

Semi-supervised learning made simple with self-supervised clustering

Xavier ALAMEDA-PINEDA 2023/06/13 2024/03/11 Research, Software, Vision

by Enrico Fini, Pietro Astolfi, Karteek Alahari, Xavier Alameda-Pineda, Julien Mairal, Moin Nabi, and Elisa Ricci. IEEE/CVF Conference on Computer Vision and Pattern Recognition 2023. [paper][code]

Abstract: Self-supervised learning models have been shown to learn rich visual representations without requiring human annotations. However, in many real-world scenarios, labels are partially…

Speech Modeling with a Hierarchical Transformer Dynamical VAE

Xavier ALAMEDA-PINEDA 2023/05/17 2024/03/11 Research, Software, Sound

by Xiaoyu Lin, Xiaoyu Bie, Simon Leglaive, Laurent Girin, and Xavier Alameda-Pineda. IEEE International Conference on Acoustics, Speech and Signal Processing 2023. [paper][code]

Abstract: The dynamical variational autoencoders (DVAEs) are a family of latent-variable deep generative models that extends the VAE to model a sequence of observed data and a…

Learning and controlling the source-filter representation of speech with a variational autoencoder

Xavier ALAMEDA-PINEDA 2023/04/07 2024/03/11 Research, Software, Sound

by Samir Sadok, Simon Leglaive, Laurent Girin, Xavier Alameda-Pineda, and Renaud Séguier. SpeechCom, 2023. [arXiv] [HAL] [code] [examples]

Abstract: Understanding and controlling latent representations in deep generative models is a challenging yet important problem for analyzing, transforming and generating various types of data. In speech processing, inspired by the anatomical mechanisms…

Variational meta-reinforcement learning for social robotics

Xavier ALAMEDA-PINEDA 2023/03/13 2024/03/11 Reinforcement Learning, Research, Software

by Anand Ballou, Xavier Alameda-Pineda, and Chris Reinke. Applied Intelligence. [paper][code]

Abstract: With the increasing presence of robots in our everyday environments, improving their social skills is of utmost importance. Nonetheless, social robotics still faces many challenges. One bottleneck is that robotic behaviors often need to be adapted, as social…