Xiaofei LI

I have been an Assistant Professor at Westlake University, Hangzhou, China, since March 2020. Please visit my new webpage: https://lixiaofei-audio.github.io/

I worked in the PERCEPTION team at INRIA Grenoble Rhône-Alpes as a post-doctoral researcher from February 2014 to January 2016, and then as a starting research scientist from February 2016. I was involved in the EARS project, the ERC VHIA project and the Samsung Electronics Lito project of the PERCEPTION team. My research interests include multi-microphone speech processing for sound source localization, separation and dereverberation, and single-microphone signal processing for noise estimation, voice activity detection, and speech enhancement.

I received a Bachelor's degree in Electronic Information from Beijing Institute of Machinery, China, in 2007. I did my PhD in Electronics at Peking University, China, from 2007 to 2013.

Contact

INRIA Grenoble Rhône-Alpes
655, avenue de l’Europe
38330 Montbonnot Saint-Martin
France
Email: xiaofei dot li at inria dot fr

Publications

Google Scholar

2019

Multichannel Online Dereverberation based on Spectral Magnitude Inverse Filtering [pdf] [audio examples] [matlab code]
Xiaofei Li, Laurent Girin, Sharon Gannot, Radu Horaud
IEEE/ACM Transactions on Audio, Speech and Language Processing, 27 (9), pp. 1365 – 1377, 2019.
Online Localization and Tracking of Multiple Moving Speakers in Reverberant Environments [pdf] [research page] [matlab code]
Xiaofei Li, Yutong Ban, Laurent Girin, Xavier Alameda-Pineda, Radu Horaud
IEEE Journal of Selected Topics in Signal Processing, 13 (1), pp. 88 – 103, 2019.
Multichannel Speech Separation and Enhancement Using the Convolutive Transfer Function [pdf] [matlab code]
Xiaofei Li, Laurent Girin, Sharon Gannot, Radu Horaud
IEEE/ACM Transactions on Audio, Speech and Language Processing, 27 (3), pp. 645 – 659, 2019.
Audio-noise Power Spectral Density Estimation Using Long Short-term Memory [pdf] [test python code and data]
Xiaofei Li, Simon Leglaive, Laurent Girin, Radu Horaud
IEEE Signal Processing Letters, 26 (6), pp. 918 – 922, 2019.
Expectation-Maximization for Speech Source Separation using Convolutive Transfer Function [pdf] [matlab code]
Xiaofei Li, Laurent Girin, Radu Horaud
CAAI Transactions on Intelligent Technologies, 4 (1), pp. 47 – 53, 2019.
Multiple Sound Source Counting and Localization Based on TF-Wise Spatial Spectrum Clustering
Bing Yang, Hong Liu, Cheng Pang, Xiaofei Li
IEEE/ACM Transactions on Audio, Speech and Language Processing, 27 (8), pp. 1241 – 1255, 2019.
Multitask Learning of Time-Frequency CNN for Sound Source Localization [pdf]
Cheng Pang, Hong Liu, Xiaofei Li
IEEE Access, vol.7, pp. 40725 – 40737, 2019.

Multichannel Speech Enhancement Based on Time-frequency Masking Using Subband Long Short-Term Memory [pdf] [audio examples]
Xiaofei Li and Radu Horaud
IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), Oct 2019, New Paltz, NY, United States.
Audio-Visual Variational Fusion for Multi-Person Tracking with Robots [pdf]
Xavier Alameda-Pineda, Soraya Arias, Yutong Ban, Guillaume Delorme, Laurent Girin, Radu Horaud, Xiaofei Li, Bastien Mourgue, Guillaume Sarrazin
ACMMM 2019 – 27th ACM International Conference on Multimedia, Oct 2019, Nice, France. pp.1059-1061.

Narrow-band Deep Filtering for Multichannel Speech Enhancement [pdf] [research page]
Xiaofei Li and Radu Horaud
preprint

2018

Audio source separation into the wild [pdf]
Laurent Girin, Sharon Gannot, Xiaofei Li
Multimodal Behavior Analysis in the Wild, Academic Press (Elsevier), Computer Vision and Pattern Recognition, 〈10.1016/B978-0-12-814601-9.00022-5〉, pp. 53-78, 2018.

Multichannel Identification and Nonnegative Equalization for Dereverberation and Noise Reduction based on Convolutive Transfer Function [pdf] [audio examples] [matlab code]
Xiaofei Li, Sharon Gannot, Laurent Girin, Radu Horaud
IEEE/ACM Transactions on Audio, Speech and Language Processing, 26 (10), pp. 1755 – 1768, 2018.
Audio-Visual Speaker Diarization Based on Spatiotemporal Bayesian Fusion [research page]
Israel D. Gebru, Silèye Ba, Xiaofei Li, Radu Horaud
IEEE Transactions on Pattern Analysis and Machine Intelligence, 40 (5), pp. 1086 – 1099, 2018.

Online Localization of Multiple Moving Speakers in Reverberant Environments [pdf] [matlab code]
Xiaofei Li, Bastien Mourgue, Laurent Girin, Sharon Gannot and Radu Horaud
IEEE Sensor Array and Multichannel Signal Processing Workshop (SAM), July 2018, Sheffield, UK.
Multisource MINT Using the Convolutive Transfer Function [pdf] [matlab code]
Xiaofei Li, Sharon Gannot, Laurent Girin, Radu Horaud
IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Apr 2018, Calgary, Canada.
A Cascaded Multiple-Speaker Localization and Tracking System [pdf] [research page] [matlab code]
Xiaofei Li, Yutong Ban, Laurent Girin, Xavier Alameda-Pineda, Radu Horaud
Proceedings of the LOCATA Challenge Workshop – a satellite event of IWAENC 2018, Sep 2018, Tokyo, Japan. pp.1-5.
Accounting for Room Acoustics in Audio-Visual Multi-Speaker Tracking
Yutong Ban, Xiaofei Li, Xavier Alameda-Pineda, Laurent Girin, Radu Horaud
IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Apr 2018, Calgary, Canada.

2017

Multiple-Speaker Localization Based on Direct-Path Features and Likelihood Maximization with Spatial Sparsity Regularization [pdf] [research page]
Xiaofei Li, Laurent Girin, Radu Horaud and Sharon Gannot
IEEE/ACM Transactions on Audio, Speech and Language Processing, 25 (10), pp. 1997 – 2012, 2017.
Binaural Sound Localization Based on Reverberation Weighting and Generalized Parametric Mapping
Cheng Pang, Hong Liu, Jie Zhang and Xiaofei Li
IEEE/ACM Transactions on Audio, Speech and Language Processing, 25 (8), pp. 1618 – 1632, 2017.

An EM algorithm for audio source separation based on the convolutive transfer function [pdf] [matlab code]
Xiaofei Li, Laurent Girin, Radu Horaud
IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), Oct 2017, New Paltz, NY, United States.
Audio Source Separation based on Convolutive Transfer Function and Frequency-Domain Lasso Optimization [pdf] [matlab code]
Xiaofei Li, Laurent Girin, Radu Horaud
IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Mar 2017, New Orleans, United States.

2016

Estimation of the Direct-Path Relative Transfer Function for Supervised Sound-Source Localization [pdf] [matlab code] [research page]
Xiaofei Li, Laurent Girin, Radu Horaud, Sharon Gannot
IEEE/ACM Transactions on Audio, Speech and Language Processing, 2016, 24 (11), pp. 2171 – 2186.
A Novel Lip Descriptor for Audio-Visual Keyword Spotting Based on Adaptive Decision Fusion [pdf]
Pingping Wu, Hong Liu, Xiaofei Li, Ting Fan, Xuewu Zhang
IEEE Transactions on Multimedia 18(3), pp. 326-338, 2016.

Reverberant Sound Localization with a Robot Head Based on Direct-Path Relative Transfer Function [pdf]
Xiaofei Li, Laurent Girin, Fabien Badeig, Radu Horaud
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Oct 2016, Daejeon, South Korea.
Voice Activity Detection Based on Statistical Likelihood Ratio With Adaptive Thresholding [pdf]
Xiaofei Li, Radu Horaud, Laurent Girin, Sharon Gannot
International Workshop on Acoustic Signal Enhancement (IWAENC), Sep 2016, Xi’an, China.
Non-Stationary Noise Power Spectral Density Estimation Based on Regional Statistics [pdf] [matlab code]
Xiaofei Li, Laurent Girin, Sharon Gannot, Radu Horaud
IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Mar 2016, Shanghai, China.

2015

A Distributed Architecture for Interacting with NAO [pdf]
Fabien Badeig, Quentin Pelorson, Soraya Arias, Vincent Drouard, Israel Dejene Gebru, Xiaofei Li, Georgios Evangelidis, Radu Horaud
International Conference on Multimodal Interaction (ICMI), Nov 2015, Seattle, WA, United States.
Local Relative Transfer Function for Sound Source Localization [pdf]
Xiaofei Li, Radu Horaud, Laurent Girin, Sharon Gannot
The European Signal Processing Conference (Eusipco), Aug 2015, Nice, France.
Estimation of Relative Transfer Function in the Presence of Stationary Noise Based on Segmental Power Spectral Density Matrix Subtraction [pdf] [matlab code]
Xiaofei Li, Laurent Girin, Radu Horaud, Sharon Gannot
IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Apr 2015, Brisbane, Australia.

Before 2015

Sound Source Localization for HRI Using FOC-based Time Difference Feature and Spatial Grid Matching [pdf]
Xiaofei Li and Hong Liu
IEEE Transactions on Cybernetics, 43 (4), pp. 1199-1212, 2013
Real-time Sound Source Localization for Mobile Robot Based on Guided Spectral-Temporal Position Method [pdf]
Xiaofei Li, Miao Shen, Wenmin Wang and Hong Liu
International Journal of Advanced Robotic Systems, 2012, vol.9, 78:2012
A survey of sound source localization for robot audition
Xiaofei Li and Hong Liu
CAAI Transactions on Intelligent Systems, 7 (1), pp. 9-20, 2012. (in Chinese)

A Two-Layer Probabilistic Model Based on Time-Delay Compensation for Binaural Sound Localization [pdf]
Hong Liu, Zhuo Fu and Xiaofei Li
IEEE International Conference on Robotics and Automation (ICRA), Karlsruhe, Germany, 6-10, May, 2013
Time Delay Estimation for Speech Signal Based on FOC-Spectrum [pdf]
Hong Liu and Xiaofei Li
International Conference on INTERSPEECH, Portland, Oregon, USA, 2012:1732-1735
Sound Source Localization for Human-Robot Interaction Based on Spatial Distribution of Time Difference Feature and Grid Matching [pdf]
Xiaofei Li, Hong Liu and Xuesong Yang
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2011.
A Selection Method of Speech Vocabulary for Human-Robot Speech Interaction [pdf]
Hong Liu and Xiaofei Li
IEEE International Conference on Systems, Man and Cybernetics (SMC), Istanbul, Turkey, 2010:2243-2248
