Speaker: Romain Serizel
Date: September 29, 2016
The main goal of speaker identification is to determine whether the speaker in an audio recording is known and, if so, to establish the speaker's identity. A recent trend is to use approaches based on feature learning to overcome the limitations of hand-crafted features. This talk will review the dominant paradigm (the so-called i-vector approach) and will propose an alternative solution based on group nonnegative matrix factorisation (NMF). The scalability issue in NMF-based approaches will then be considered. Solutions taking advantage of stochastic mini-batch processing and general-purpose graphics processing units (GPGPU) will be proposed, aiming at highly efficient feature learning for large-scale datasets.
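
For readers unfamiliar with NMF, the following is a minimal sketch of how mini-batch processing can be combined with multiplicative updates. It is an illustration only, not the method presented in the talk: it assumes plain NMF with a Euclidean cost (not the group NMF variant), runs on the CPU with NumPy, and uses a hypothetical function name `minibatch_nmf` and hypothetical parameters (`rank`, `batch_size`, `n_epochs`). The data matrix `V` is assumed to hold one nonnegative feature vector per column.

```python
import numpy as np

def minibatch_nmf(V, rank, batch_size=32, n_epochs=50, eps=1e-9, seed=0):
    """Approximate a nonnegative matrix V (features x samples) as W @ H.

    Illustrative assumption: Euclidean-cost multiplicative updates, with
    the activations H updated one mini-batch of columns at a time and the
    dictionary W updated once per epoch from accumulated statistics.
    """
    rng = np.random.default_rng(seed)
    n_features, n_samples = V.shape
    W = rng.random((n_features, rank)) + eps
    H = rng.random((rank, n_samples)) + eps

    for _ in range(n_epochs):
        num = np.zeros_like(W)          # accumulates V_b @ H_b.T
        gram = np.zeros((rank, rank))   # accumulates H_b @ H_b.T
        for start in range(0, n_samples, batch_size):
            cols = slice(start, start + batch_size)
            Vb = V[:, cols]
            Hb = H[:, cols]             # a view: in-place updates modify H
            # Update the activations of this mini-batch with W held fixed.
            Hb *= (W.T @ Vb) / (W.T @ W @ Hb + eps)
            # Only small summary statistics are kept between batches.
            num += Vb @ Hb.T
            gram += Hb @ Hb.T
        # Update the dictionary from the statistics of the whole epoch.
        W *= num / (W @ gram + eps)
    return W, H

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    V = rng.random((20, 200))           # synthetic nonnegative data
    W, H = minibatch_nmf(V, rank=5)
    print("relative error:", np.linalg.norm(V - W @ H) / np.linalg.norm(V))
```

In this sketch only one mini-batch of columns is processed at a time and the per-epoch statistics have a size independent of the number of samples, which is the property that makes such schemes attractive for large datasets. Variants that update `W` after every mini-batch, or that run the matrix products on a GPU, follow the same structure.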