Deep learning-based speaker localization and speech separation from Ambisonics recordings

Speaker: Laureline Pérotin

Date: November 22, 2018, 10:30 – C005

Abstract:

Personal assistants are flourishing, but voice control remains hard to achieve in adverse conditions, that is, in the presence of noise, competing speakers, reverberation, or reflections from furniture. Preprocessing steps such as speaker localization and speech enhancement have been shown to help automatic speech recognition. I will present deep learning-based methods designed for Ambisonics recordings, an audio format that is particularly suited to the spatial representation of sound fields.