Vision – RobotLearn

MEGA: Masked Generative Autoencoder for Human Mesh Recovery

Xavier ALAMEDA-PINEDA 2025/03/11 2025/04/11 Research, Vision

by Guénolé Fiche, Simon Leglaive, Xavier Alameda-Pineda, Francesc Moreno-Noguer IEEE/CVF Conference on Computer Vision and Pattern Recognition [ paper ] [ code ] Abstract: Human Mesh Recovery (HMR) from a single RGB image is a highly ambiguous problem, as similar 2D projections can correspond to multiple 3D interpretations. Nevertheless,…

Lost and Found: Overcoming Detector Failures in Online Multi-Object Tracking

Xavier ALAMEDA-PINEDA 2024/09/02 2025/04/11 Research, Uncategorized, Vision

by Lorenzo Vaquero, Yihong Xu, Xavier Alameda-Pineda, Víctor M Brea, Manuel Mucientes European Conference on Computer Vision [ paper ] [ code ] Abstract: Multi-object tracking (MOT) endeavors to precisely estimate the positions and identities of multiple objects over time. The prevailing approach, tracking-by-detection (TbD), first detects objects and then…

VQ-HPS: Human Pose and Shape Estimation in a Vector-Quantized Latent Space

Xavier ALAMEDA-PINEDA 2024/09/02 2025/04/11 Research, Uncategorized, Vision

by Guénolé Fiche, Simon Leglaive, Xavier Alameda-Pineda, Antonio Agudo, Francesc Moreno-Noguer European Conference on Computer Vision [ paper ] [ code ] Abstract: Previous works on Human Pose and Shape Estimation (HPSE) from RGB images can be broadly categorized into two main groups: parametric and non-parametric approaches. Parametric techniques leverage…

Learning for Companion Robots: Preparation and Adaptation

Xavier ALAMEDA-PINEDA 2024/07/11 2025/04/11 Reinforcement Learning, Research, Sound, Uncategorized, Vision

Xavier Alameda-Pineda was a keynote speaker at RFIAP/cAP 2024, on the topic of Learning for Companion Robots: Preparation and Adaptation.

Autoregressive GAN for Semantic Unconditional Head Motion Generation

Xavier ALAMEDA-PINEDA 2023/12/13 2024/03/11 Research, Software, Vision

by Louis Airale, Xavier Alameda-Pineda, Stéphane Lathuilière, and Dominique Vaufreydaz ACM Transactions on Multimedia Tools and Applications [paper][code] Abstract: We address the task of unconditional head motion generation to animate still human faces in a low-dimensional semantic space. Deviating from talking head generation conditioned on audio that seldom emphasizes realistic…

Unsupervised Performance Analysis of 3D Face Alignment with a Statistically Robust Confidence Test

Xavier ALAMEDA-PINEDA 2023/12/08 2024/03/11 Research, Software, Vision

by Mostafa Sadeghi, Xavier Alameda-Pineda and Radu Horaud Neurocomputing, volume 564, January 2024 [Code & Data] Abstract: We address the problem of analyzing the performance of 3D face alignment (3DFA), or facial landmark localization. Performance analysis is usually based on annotated datasets. Nevertheless, in the particular case of 3DFA, the…

Motion-DVAE: Unsupervised learning for fast human motion denoising

Xavier ALAMEDA-PINEDA 2023/11/03 2024/03/11 Research, Software, Vision

by Guénolé Fiche, Simon Leglaive, Xavier Alameda-Pineda, and Renaud Séguier ACM SIGGRAPH Conference on Motion, Interaction and Games [paper][code] Abstract: Pose and motion priors are crucial for recovering realistic and accurate human motion from noisy observations. Substantial progress has been made on pose and shape estimation from images, and recent…

On the Effectiveness of LayerNorm Tuning for Continual Learning in Vision Transformers

Xavier ALAMEDA-PINEDA 2023/10/03 2024/03/11 Research, Software, Vision

by Thomas De Min, Massimiliano Mancini, Karteek Alahari, Xavier Alameda-Pineda, and Elisa Ricci ICCV 2023 Workshops [paper][code] Abstract: State-of-the-art rehearsal-free continual learning methods exploit the peculiarities of Vision Transformers to learn task-specific prompts, drastically reducing catastrophic forgetting. However, there is a tradeoff between the number of learned parameters and the…

A Comprehensive Multi-scale Approach for Speech and Dynamics Synchrony in Talking Head Generation

Xavier ALAMEDA-PINEDA 2023/08/29 2024/03/11 Research, Software, Sound, Vision

by Louis Airale, Dominique Vaufreydaz, and Xavier Alameda-Pineda [paper][code] Abstract: Animating still face images with deep generative models using a speech input signal is an active research topic and has seen important recent progress. However, much of the effort has been put into lip syncing and rendering quality while the…

Semi-supervised Learning Made Simple with Self-supervised Clustering

Xavier ALAMEDA-PINEDA 2023/06/13 2024/03/11 Research, Software, Vision

by Enrico Fini, Pietro Astolfi, Karteek Alahari, Xavier Alameda-Pineda, Julien Mairal, Moin Nabi, and Elisa Ricci IEEE/CVF Conference on Computer Vision and Pattern Recognition 2023 [paper][code] Abstract: Self-supervised learning models have been shown to learn rich visual representations without requiring human annotations. However, in many real-world scenarios, labels are partially…