Deep learning-based speaker localization and speech separation from Ambisonics recordings

Speaker: Laureline Pérotin

Date: November 22, 2018 at 10:30 – C005


Personal assistants are flourishing, but voice control remains hard to achieve in adverse conditions, when noise, other speakers, reverberation or furniture reflections are present. Preprocessing steps such as speaker localization and speech enhancement have been shown to help automatic speech recognition. I will present deep learning-based methods designed for Ambisonics recordings, an audio format that is particularly well suited to the spatial representation of sound fields.