Unsupervised Speech Enhancement using Dynamical Variational Auto-Encoders

by Xiaoyu Bie, Simon Leglaive, Xavier Alameda-Pineda and Laurent Girin

IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP), 2022. [arXiv][Code]

Abstract. Dynamical variational autoencoders (DVAEs) are a class of deep generative models with latent variables, dedicated to model time series of high-dimensional data. DVAEs can be considered as extensions of…

A Benchmark of Dynamical Variational Autoencoders applied to Speech Spectrogram Modeling

by Xiaoyu Bie, Laurent Girin, Simon Leglaive, Thomas Hueber and Xavier Alameda-Pineda

Interspeech’21, Brno, Czech Republic. [paper][slides][code][bibtex]

Abstract. The Variational Autoencoder (VAE) is a powerful deep generative model that is now extensively used to represent high-dimensional complex data via a low-dimensional latent space learned in an unsupervised manner. In the…
