Speech Modeling with a Hierarchical Transformer Dynamical VAE

by Xiaoyu Lin, Xiaoyu Bie, Simon Leglaive, Laurent Girin, and Xavier Alameda-Pineda IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2023 [paper][code] Abstract: Dynamical variational autoencoders (DVAEs) are a family of latent-variable deep generative models that extend the VAE to model a sequence of observed data and a…

Continue reading

Back to MLP: A Simple Baseline for Human Motion Prediction

by Wen Guo*, Yuming Du*, Xi Shen, Vincent Lepetit, Xavier Alameda-Pineda, and Francesc Moreno-Noguer IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2023, Waikoloa, Hawaii [paper] [code] [HAL] Abstract. This paper tackles the problem of human motion prediction, which consists of forecasting future body poses from historically observed sequences. State-of-the-art…

Continue reading

Learning and controlling the source-filter representation of speech with a variational autoencoder

by Samir Sadok, Simon Leglaive, Laurent Girin, Xavier Alameda-Pineda, and Renaud Séguier Speech Communication, 2023 [arXiv] [HAL] [code] [examples] Abstract: Understanding and controlling latent representations in deep generative models is a challenging yet important problem for analyzing, transforming and generating various types of data. In speech processing, taking inspiration from the anatomical mechanisms…

Continue reading

Variational meta-reinforcement learning for social robotics

by Anand Ballou, Xavier Alameda-Pineda, and Chris Reinke Applied Intelligence [paper][code] Abstract: With the increasing presence of robots in our everyday environments, improving their social skills is of utmost importance. Nonetheless, social robotics still faces many challenges. One bottleneck is that robotic behaviors often need to be adapted, as social…

Continue reading

Successor Feature Representations

by Chris Reinke and Xavier Alameda-Pineda Transactions on Machine Learning Research [paper][code] Abstract. Transfer in Reinforcement Learning aims to improve learning performance on target tasks using knowledge from experienced source tasks. Successor features (SF) are a prominent transfer mechanism in domains where the reward function changes between tasks. They reevaluate…

Continue reading

Expression-preserving face frontalization improves visually assisted speech processing

by Zhiqi Kang, Mostafa Sadeghi, Radu Horaud and Xavier Alameda-Pineda International Journal of Computer Vision, 2023, 131 (5), pp. 1122-1140 [arXiv] [HAL] [webpage] Abstract. Face frontalization consists of synthesizing a frontally-viewed face from an arbitrarily-viewed one. The main contribution of this paper is a frontalization methodology that preserves non-rigid facial deformations in order to boost…

Continue reading

Continual Attentive Fusion for Incremental Learning in Semantic Segmentation

by Guanglei Yang, Enrico Fini, Dan Xu, Paolo Rota, Mingli Ding, Hao Tang, Xavier Alameda-Pineda, and Elisa Ricci IEEE Transactions on Multimedia [arXiv][HAL] Abstract. Over the past years, semantic segmentation, similar to many other tasks in computer vision, has benefited from the progress in deep neural networks, resulting in significantly improved performance. However, deep architectures trained…

Continue reading

A Proposal-based Paradigm for Self-supervised Sound Source Localization in Videos

by Hanyu Xuan, Zhiliang Wu, Jian Yang, Yan Yan, and Xavier Alameda-Pineda IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022, New Orleans, US [HAL] Abstract. Humans can easily recognize where and how a sound is produced by watching a scene and listening to the corresponding audio cues. To achieve such cross-modal perception on machines, existing methods…

Continue reading

Self-Supervised Models are Continual Learners

by Enrico Fini, Victor G. Turrisi da Costa, Xavier Alameda-Pineda, Elisa Ricci, Karteek Alahari, Julien Mairal IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022, New Orleans, USA [arXiv][Code][HAL] Abstract. Self-supervised models have been shown to produce comparable or better visual representations than their supervised counterparts when trained offline on unlabeled data at scale. However,…

Continue reading