Feb 21

PhD position: Deep-learning based speech enhancement with ad-hoc microphone arrays

General information:

  • Location: Loria/Inria Nancy Grand-Est (Nancy, France) and LTCI/Télécom ParisTech (Paris, France)
  • Supervisors: Romain Serizel (Université de Lorraine, Loria), Irina Illina (Université de Lorraine, Loria), Slim Essid (Télécom ParisTech, LTCI)
  • Research theme: Perception, Cognition, Interaction
  • Project-team: Multispeech
  • Starting date: September 2018

Deadline to apply: April 30th, 2018

Scientific context:

Speech is everywhere in our daily life. It is one of the most intuitive means of communication, and chances are high that during a regular day you will have many spoken interactions. However, most computer applications based on speech communication rely on the assumption that a “clean” version of the speech is available, which is rarely true in real-life scenarios. One solution to this noise problem is to apply so-called speech enhancement techniques, which aim at extracting the speech component from a noisy speech mixture [1]. With the fast deployment of mobile devices equipped with two or more microphones, almost everyone now has access to many microphones at all times. However, exploiting multiple microphones from several devices (which form a so-called heterogeneous microphone array) is far from trivial [2, 3].

Missions:

Over the years, a large body of work has been devoted to multichannel speech enhancement algorithms, initially based on signal processing [4, 5] and more recently on deep learning [6]. Applying these algorithms to signals collected with an array composed of multiple devices requires signal-level calibration and synchronization between devices, which is quite challenging. In this thesis, instead of considering each device as part of a large array, we will consider the signals from each device as a different view of the same acoustic scene. To solve the problem, we will investigate joint learning approaches based, for example, on deep learning [7, 8], nonnegative tensor co-factorization [9], or a combination of both.

Profile:

  • MSc in computer science, machine learning, or signal processing
  • Experience with the Python programming language
  • Experience with deep learning toolkits is a plus

References:

[1] Loizou, P. C. “Speech enhancement: theory and practice.” CRC Press, 2013.

[2] Kako, T., Niwa, K., Kobayashi, K., and Ohmuro, H. “Wiener filter design by estimating sensitivities between distributed asynchronous microphones and sound sources.” In Proc of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) (2015), pp. 1–5.

[3] Doclo, S., Spriet, A., Wouters, J., and Moonen, M. “Frequency-domain criterion for the speech distortion weighted multichannel Wiener filter for robust noise reduction.” Speech Communication 49, 7 (2007), 636–656.

[4] Serizel, R., Moonen, M., Van Dijk, B., and Wouters, J. “Low-rank Approximation Based Multichannel Wiener Filter Algorithms for Noise Reduction with Application in Cochlear Implants.” IEEE/ACM Transactions on Audio, Speech and Language Processing 22 (2014), 785–799.

[6] Nugraha, A. A., Liutkus, A., and Vincent, E. “Multichannel audio source separation with deep neural networks.” IEEE/ACM Transactions on Audio, Speech, and Language Processing 24, 9 (2016), 1652–1664.

[7] Wang, W., Arora, R., Livescu, K., and Bilmes, J. A. “On deep multi-view representation learning.” In Proc of the International Conference on Machine Learning (ICML) (2015), pp. 1083–1092.

[8] Andrew, G., Arora, R., Bilmes, J. A., and Livescu, K. “Deep canonical correlation analysis.” In Proc of the International Conference on Machine Learning (ICML) (2013), pp. 1247–1255.

[9] Seichepine, N., Essid, S., Févotte, C., and Cappé, O. “Soft nonnegative matrix co-factorization.” IEEE Transactions on Signal Processing 62, 22 (2014), 5940–5949.

Additional information:

Candidates are required to send the following documents in a single PDF to Romain Serizel (https://members.loria.fr/RSerizel/):

  • CV

  • A cover/motivation letter describing their interest in the topic

  • Degree certificates and transcripts for Bachelor's and Master's degrees (or the last 5 years)

  • Master's thesis (or equivalent) if it is already completed; otherwise, a description of the work in progress

  • The candidate's publications (or web links to them), if any (candidates are not expected to have any)

In addition, one recommendation letter from the person who supervises (or supervised) the Master's thesis (or research project or internship) should be sent directly by its author to the prospective PhD advisor.