Category: News

Sparse representation, dictionary learning, and deep neural networks: their connections and new algorithms

Seminar by Mostafa Sadeghi, Sharif University of Technology, Tehran. Tuesday 19 June 2018, 14:30 – 15:30, room F107, INRIA Montbonnot Saint-Martin. Abstract: Over the last decade, sparse representation, dictionary learning, and deep artificial neural networks have dramatically impacted the signal processing and machine learning areas by yielding state-of-the-art results in a variety of tasks, …

Continue reading

Deep Regression Models and Computer Vision Applications for Multiperson Human-Robot Interaction

PhD defense by Stéphane Lathuilière. Tuesday 22 May 2018, 11:00, Grand Amphithéâtre, INRIA Grenoble Rhône-Alpes, Montbonnot Saint-Martin. Abstract: In order to interact with humans, robots need to perform basic perception tasks such as face detection, human pose estimation, or speech recognition. However, in order to have a natural interaction with humans, the robot needs to model …

Continue reading

Audio-Visual Analysis in the Framework of Humans Interacting with Robots

PhD defense by Israel D. Gebru. Friday 13 April 2018, 9:30 – 10:30, Grand Amphithéâtre, INRIA Grenoble Rhône-Alpes, Montbonnot Saint-Martin. In recent years, there has been a growing interest in human-robot interaction (HRI), with the aim of enabling robots to naturally interact and communicate with humans. Natural interaction implies that robots not only need to …

Continue reading

Plane Extraction from Depth-Data

The following journal paper has just been published: Richard Marriott, Alexander Pashevich, and Radu Horaud. Plane Extraction from Depth Data Using a Gaussian Mixture Regression Model. Pattern Recognition Letters, vol. 110, pages 44–50, 2018. The paper is freely available for download from our publication page or directly from Elsevier.

Software engineer / Audio-visual perception for robotics

Context: The Perception team, at INRIA Grenoble Rhône-Alpes and the Jean Kuntzmann Laboratory of Grenoble Alpes University, works on computational models for mapping images and sounds onto meaning and actions. The team members address these challenging topics: computer vision, auditory signal processing and scene analysis, machine learning, and robotics. In particular, we develop methods for the …

Continue reading

A Bayesian Framework for Head Pose Estimation and Tracking

PhD defense by Vincent Drouard. Monday 18 December 2017, 11:00 – 12:00, Grand Amphithéâtre, INRIA Montbonnot Saint-Martin. In this thesis, we address the well-known problem of head-pose estimation in the context of human-robot interaction (HRI). We accomplish this task with a two-step approach. First, we focus on the estimation of the head pose from …

Continue reading

IEEE/RSJ IROS’17: Novel Technology Paper Award Finalist!

Yutong Ban (PhD student) and his co-authors, Xavier Alameda-Pineda, Fabien Badeig, and Radu Horaud, were among the five finalists of the “Novel Technology Paper Award for Amusement Culture” at the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Vancouver, Canada, September 2017, for their paper Tracking a Varying Number of People with a Visually-Controlled …

Continue reading

ERC Proof of Concept (PoC) Grant Awarded to Radu Horaud

As an ERC Advanced Grant holder, Radu Horaud was awarded a Proof of Concept grant for his project Vision and Hearing in Action Laboratory (VHIALab). The project will develop software packages enabling companion robots to robustly interact with multiple users.

IEEE HSCMA’17: Best Paper Award!

Israel Dejene Gebru (PhD student) and his co-authors, Christine Evers, Patrick Naylor (both from Imperial College London), and Radu Horaud, received the Best Paper Award at the IEEE Fifth Joint Workshop on Hands-free Speech Communication and Microphone Arrays, San Francisco, USA, 1–3 March 2017, for their paper Audio-visual Tracking by Density Approximation in a Sequential …

Continue reading

Audio-visual diarization dataset now available for download

We have just made our new AVDIAR dataset publicly available. AVDIAR stands for "audio-visual diarization". The dataset contains recordings of social gatherings made with two cameras and six microphones. Both the audio and the visual data were carefully annotated, so that it is possible to evaluate the performance of various algorithms, such as person tracking, speech-source localization, speaker …

Continue reading