Category: Job Offers

Master Internship on Audio-visual speech separation using variational auto-encoders

Topic: In this Master thesis, we address the problem of speech separation given mixed speech from a single-channel microphone and video frames of the speakers involved. Although several audio-only speech separation methods exist [1], here we aim to also use the visual information, that is, video frames of the speakers’ lips. This would help to distinguish different …

Master Internship on face alignment for audio-visual speech enhancement

In many audio-visual applications, e.g., speech enhancement and speech recognition, it is desirable to have aligned images of the mouth region so that a deep neural network can extract reliable visual features. Indeed, the quality of the extracted visual features affects the performance of audio-visual applications. In practice, however, a speaker’s face is constantly …

Master Internship on Deep Speaker Recognition

Topic: Identification models have improved considerably with the recent development of deep learning, especially in the visual domain, where they have advanced face recognition [1] and person re-identification [2]. However, comparable performance has yet to be achieved in audio-based speaker recognition. Recent dataset assembling efforts [3] leverage the use of …

Master internship on deep Bayesian filtering

In signal processing and in computer vision, some of the most powerful tracking methods are based on the Kalman filter. The latter belongs to the unsupervised class of machine learning techniques and may well be viewed either as the simplest dynamic Bayesian network (DBN) or as a state-space model: it recursively predicts over time a …

(Closed) Master Internship on Robust Deep Regression

Topic: In this Master thesis, we address the problem of robustly training a ConvNet for regression, also called deep robust regression [1,2]. Traditionally, deep regression employs the L2 loss function [3], which is known to be sensitive to outliers, i.e., samples that either lie at an abnormal distance from the majority of the training samples, …

Researcher on Deep and Reinforcement Learning for Robotics

Starting Date: February 1st, 2020.
Funding: The H2020 ICT SPRING Project
Contact Point: Xavier Alameda-Pineda
Duration: From 2 up to 4 years.
To apply: https://jobs.inria.fr/public/classic/fr/offres/2019-02083
General Context: SPRING – Socially Pertinent Robots in Gerontological Healthcare – is a 4-year R&D project fully funded by the European Commission under the H2020 framework. SPRING aims to develop …

Engineer on Deep Learning and Cloud Computing

Starting Date: November 1st, 2019 – February 1st, 2020.
Funding: The H2020 ICT SPRING Project
Contact Point: Xavier Alameda-Pineda
Duration: 2 years and up to 4 years.
To apply: https://jobs.inria.fr/public/classic/fr/offres/2019-02081
General Context: SPRING – Socially Pertinent Robots in Gerontological Healthcare – is a 4-year R&D project fully funded by the European Commission under the H2020 framework. SPRING …

Engineer on Deep Learning and Robotics

Starting Date: November 1st, 2019 – February 1st, 2020.
Duration: 2 years and up to 4 years.
Funding: The H2020 ICT SPRING Project
Contact Point: Xavier Alameda-Pineda
To apply: https://jobs.inria.fr/public/classic/fr/offres/2019-02082
General Context: SPRING – Socially Pertinent Robots in Gerontological Healthcare – is a 4-year R&D project fully funded by the European Commission under the H2020 framework. …

(Closed) MSc. Project on Speaker identity modeling with deep learning for re-identification

Short description: Speaker identification is the task of determining which speaker produced a given utterance [1]. Speaker verification, or re-identification, on the other hand, aims to determine whether there is a match between a given speech utterance and a target speaker …

(Closed) MSc. Project on Coupled Audio-visual Multi-speaker Tracking

Short description: Multi-speaker tracking has been widely investigated, and the Perception team has contributed a consistent methodological framework based on variational Bayes techniques [1-4]. Often, audio-visual tracking methods first map all auditory and visual information into the same space and then run a tracking algorithm. However, in …
