Wednesday, 22 March 2017, 10:30 – 11:30 am, room F107, INRIA Montbonnot
Seminar by Simon Leglaive, Télécom ParisTech, Paris
Abstract: We tackle the problem of multichannel audio source separation in under-determined reverberant mixtures. The aim of this talk is to present source separation approaches that take advantage of prior knowledge about the mixing filters in order to guide their estimation. In the first part, we consider an approximate convolutive mixing model in the short-time Fourier transform domain. The mixing process is thus represented by the frequency responses of the mixing filters. We propose to characterize early reverberation with an autoregressive model, while late reverberation is represented, in accordance with results from statistical room acoustics, by an autoregressive moving average model. These models aim to translate the temporal characteristics of the mixing filters into frequency-domain correlations. Blind source separation is then achieved with an expectation-maximization algorithm. In the second part, we investigate a time-domain convolutive mixture model while keeping a time-frequency source model based on non-negative matrix factorization. In this context, the convolutive mixing process is modeled exactly. Variational inference of the time-frequency source coefficients is then performed from the time-domain mixture observations. We show that this approach leads to good separation quality in a semi-blind setting where the mixing filters are assumed to be known. It is also suitable for incorporating simple priors on the impulse responses of the mixing filters, which could help achieve fully blind source separation.
Bio: Simon Leglaive is a Ph.D. student at LTCI, Télécom ParisTech, Université Paris-Saclay, within the AAO group of the TSI department, under the supervision of Roland Badeau and Gaël Richard. His thesis focuses on under-determined reverberant audio source separation. Simon’s interests include statistical models for audio signals, audio source separation and machine learning applied to audio signal processing.