Recent Publications

Below is the list of recent publications from the team. The exhaustive list of publications can be found here.

HAL publications of the structure parole; multispeech

2024

Journal articles

titre: Training RNN Language Models on Uncertain ASR Hypotheses in Limited Data Scenarios
auteur: Imran Ahamad Sheikh, Emmanuel Vincent, Irina Illina
article: Computer Speech and Language, 2024, 83, pp.101555. ⟨10.1016/j.csl.2023.101555⟩
Accès au texte intégral et bibtex

titre: Automatic segmentation of vocal tract articulators in real-time magnetic resonance imaging
auteur: Vinicius Ribeiro, Karyna Isaieva, Justine Leclere, Jacques Felblinger, Pierre-André Vuissoz, Yves Laprie
article: Computer Methods and Programs in Biomedicine, In press, 243 (2), pp.107907. ⟨10.1016/j.cmpb.2023.107907⟩
Accès au texte intégral et bibtex

titre: Unsupervised Performance Analysis of 3D Face Alignment with a Statistically Robust Confidence Test
auteur: Mostafa Sadeghi, Xavier Alameda-Pineda, Radu Horaud
article: Neurocomputing, 2024, 564, pp.1-16. ⟨10.1016/j.neucom.2023.126941⟩
Accès au texte intégral et bibtex

Conference papers

titre: Analyse textuelle de manuscrits mayas et égyptiens : apports d’un codage par n-grammes, et de représentations multidimensionnelles graduées
auteur: Bruno Delprat, Martine Cadot, Alain Lelu
article: JADT 2024 – 17es Journées internationales d’Analyse statistique des Données Textuelles, SeSLa (Séminaire des Sciences du Langage de l’UCLouvain – Site Saint-Louis), in collaboration with LASLA (Laboratoire d’Analyse statistique des Langues anciennes de l’Université de Liège), Jun 2024, Brussels, Belgium
Accès au texte intégral et bibtex

titre: RoboVox: A Single/Multi-channel Far-field Speaker Recognition Benchmark for a Mobile Robot
auteur: Mohammad Mohammadamini, Driss Matrouf, Mickael Rouvier, Jean-François Bonastre, Romain Serizel, Théophile Gonos
article: LREC-COLING, ELRA, May 2024, Turin, Italy
Accès au texte intégral et bibtex

titre: A weighted-variance variational autoencoder model for speech enhancement
auteur: Ali Golmakani, Mostafa Sadeghi, Xavier Alameda-Pineda, Romain Serizel
article: ICASSP 2024 – International Conference on Acoustics Speech and Signal Processing, IEEE, Apr 2024, Seoul, South Korea. pp.1-5
Accès au texte intégral et bibtex

titre: Unsupervised speech enhancement with diffusion-based generative models
auteur: Berné Nortier, Mostafa Sadeghi, Romain Serizel
article: International Conference on Acoustics Speech and Signal Processing (ICASSP), IEEE, Apr 2024, Seoul, South Korea. ⟨10.48550/arXiv.2309.10450⟩
Accès au texte intégral et bibtex

titre: Posterior sampling algorithms for unsupervised speech enhancement with recurrent variational autoencoder
auteur: Mostafa Sadeghi, Romain Serizel
article: International Conference on Acoustics Speech and Signal Processing (ICASSP), IEEE, Apr 2024, Seoul, South Korea. ⟨10.48550/arXiv.2309.10439⟩
Accès au texte intégral et bibtex

titre: Diffusion-based speech enhancement with a weighted generative-supervised learning loss
auteur: Jean-Eudes Ayilo, Mostafa Sadeghi, Romain Serizel
article: International Conference on Acoustics Speech and Signal Processing (ICASSP), IEEE, Apr 2024, Seoul, South Korea. ⟨10.48550/arXiv.2309.10457⟩
Accès au texte intégral et bibtex

Book sections

titre: STATE OF THE ART
auteur: Guillaume Coiffier, Sewade Ogun, Leo Valque, Priyansh Trivedi
article: THINK BEFORE LOADING, 2024, 978-2-9591975-0-5
Accès au texte intégral et bibtex

Documents associated with scientific events

titre: The Voice Privacy 2024 Challenge Evaluation Plan
auteur: Natalia Tomashenko, Xiaoxiao Miao, Pierre Champion, Sarina Meyer, Xin Wang, Emmanuel Vincent, Michele Panariello, Nicholas Evans, Junichi Yamagishi, Massimiliano Todisco
article: 4th Symposium on Security and Privacy in Speech Communication 2024, Sep 2024, Kos Island, Greece
Accès au texte intégral et bibtex

Preprints, Working Papers, …

titre: Improving Speaker Assignment in Speaker-Attributed ASR for Real Meeting Applications
auteur: Can Cui, Imran Ahamad Sheikh, Mostafa Sadeghi, Emmanuel Vincent
article: 2024
Accès au texte intégral et bibtex

titre: Objective and subjective evaluation of speech enhancement methods in the UDASE task of the 7th CHiME challenge
auteur: Simon Leglaive, Matthieu Fraticelli, Hend ElGhazaly, Léonie Borne, Mostafa Sadeghi, Scott Wisdom, Manuel Pariente, John R. Hershey, Daniel Pressnitzer, Jon P. Barker
article: 2024
Accès au texte intégral et bibtex

2023

Journal articles

titre: AVI-Corse : méthodologie et enjeux d’un projet participatif. Des avatars numériques au service du langage et de la communication.
auteur: Agnès Piquard-Kipffer, Karen Martinelli, Léa Dussere, Anne Sancier, Jérémy Zytnicki, Caroline Barbot-Bouzit, Slim Ouni
article: La Nouvelle revue Éducation et société inclusives, To appear
Accès au texte intégral et bibtex

titre: Super-Resolved Dynamic 3D Reconstruction of the Vocal Tract during Natural Speech
auteur: Karyna Isaieva, Freddy Odille, Yves Laprie, Guillaume Drouot, Jacques Felblinger, Pierre-André Vuissoz
article: Journal of Imaging, 2023, 9 (10), pp.233. ⟨10.3390/jimaging9100233⟩
Accès au texte intégral et bibtex

titre: Stuttering Detection Using Speaker Representations and Self-supervised Contextual Embeddings
auteur: Shakeel Sheikh, Md Sahidullah, Fabrice Hirsch, Slim Ouni
article: International Journal of Speech Technology, 2023, ⟨10.1007/s10772-023-10032-1⟩
Accès au texte intégral et bibtex

titre: Expression-preserving face frontalization improves visually assisted speech processing
auteur: Zhiqi Kang, Mostafa Sadeghi, Radu Horaud, Xavier Alameda-Pineda
article: International Journal of Computer Vision, 2023, 131 (5), pp.1122-1140. ⟨10.1007/s11263-022-01742-1⟩
Accès au texte intégral et bibtex

titre: Algorithms for audio inpainting based on probabilistic nonnegative matrix factorization
auteur: Ondřej Mokrý, Paul Magron, Thomas Oberlin, Cédric Févotte
article: Signal Processing, 2023, ⟨10.1016/j.sigpro.2022.108905⟩
Accès au texte intégral et bibtex

titre: Privacy in Speech and Language Technology
auteur: Simone Fischer-Hübner, Dietrich Klakow, Peggy Valcke, Emmanuel Vincent
article: Dagstuhl Reports, 2023, 12 (8), pp.60-102. ⟨10.4230/DagRep.12.8.60⟩
Accès au texte intégral et bibtex

titre: Advancing Stuttering Detection via Data Augmentation, Class-Balanced Loss and Multi-Contextual Deep Learning
auteur: Shakeel Ahmad Sheikh, Md Sahidullah, Fabrice Hirsch, Slim Ouni
article: IEEE Journal of Biomedical and Health Informatics, 2023, ⟨10.1109/JBHI.2023.3248281⟩
Accès au texte intégral et bibtex

titre: Cross-corpora spoken language identification with domain diversification and generalization
auteur: Spandan Dey, Md Sahidullah, Goutam Saha
article: Computer Speech and Language, 2023, 81 (June 2023), pp.101489. ⟨10.1016/j.csl.2023.101489⟩
Accès au texte intégral et bibtex

titre: Guest editorial: Special issue on advances in deep learning based speech processing
auteur: Xiaolei Zhang, Lei Xie, Eric Fosler-Lussier, Emmanuel Vincent
article: Neural Networks, 2023, 158, ⟨10.1016/j.neunet.2022.11.033⟩
Accès au texte intégral et bibtex

titre: Differentially private speaker anonymization
auteur: Ali Shahin Shamsabadi, Brij Mohan Lal Srivastava, Aurélien Bellet, Nathalie Vauquier, Emmanuel Vincent, Mohamed Maouche, Marc Tommasi, Nicolas Papernot
article: Proceedings on Privacy Enhancing Technologies, 2023, 2023 (1), ⟨10.48550/arXiv.2202.11823⟩
Accès au bibtex

titre: Modulation spectral features for speech emotion recognition using deep neural networks
auteur: Premjeet Singh, Md Sahidullah, Goutam Saha
article: Speech Communication, 2023, 146 (January), pp.53-69. ⟨10.1016/j.specom.2022.11.005⟩
Accès au texte intégral et bibtex

Conference papers

titre: End-to-end Multichannel Speaker-Attributed ASR: Speaker Guided Decoder and Input Feature Analysis
auteur: Can Cui, Imran Ahamad Sheikh, Mostafa Sadeghi, Emmanuel Vincent
article: 2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU 2023), Dec 2023, Taipei, Taiwan. ⟨10.1109/ASRU57964.2023.10389729⟩
Accès au texte intégral et bibtex

titre: From Discrete Tokens to High-Fidelity Audio Using Multi-Band Diffusion
auteur: Robin San Roman, Yossi Adi, Antoine Deleforge, Romain Serizel, Gabriel Synnaeve, Alexandre Défossez
article: NeurIPS 2023 – Conference on Neural Information Processing Systems, Dec 2023, New Orleans, United States. ⟨10.48550/arXiv.2308.02560⟩
Accès au texte intégral et bibtex

titre: Find-2-Find: Multitask Learning for Anaphora Resolution and Object Localization
auteur: Cennet Oguz, Pascal Denis, Emmanuel Vincent, Simon Ostermann, Josef van Genabith
article: 2023 Conference on Empirical Methods in Natural Language Processing, Dec 2023, Singapore, Singapore
Accès au texte intégral et bibtex

titre: Pretraining Representations for Bioacoustic Few-Shot Detection using Supervised Contrastive Learning
auteur: Ilyass Moummad, Romain Serizel, Nicolas Farrugia
article: Detection and Classification of Acoustic Scenes and Events 2023, Sep 2023, Tampere, Finland
Accès au texte intégral et bibtex

titre: Post-Processing Independent Evaluation of Sound Event Detection Systems
auteur: Janek Ebbers, Reinhold Haeb-Umbach, Romain Serizel
article: DCASE 2023 – 8th Workshop on Detection and Classification of Acoustic Scenes and Events, Sep 2023, Tampere, Finland. ⟨10.48550/arXiv.2306.15440⟩
Accès au texte intégral et bibtex

titre: Monitoring environmental impact of DCASE systems: Why and how?
auteur: Constance Douwes, Francesca Ronchini, Romain Serizel
article: Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, Sep 2023, Tampere, Finland
Accès au bibtex

titre: Spectrogram Inversion for Audio Source Separation via Consistency, Mixing, and Magnitude Constraints
auteur: Paul Magron, Tuomas Virtanen
article: EUSIPCO 2023, EURASIP, Sep 2023, Helsinki, Finland. ⟨10.48550/arXiv.2303.01864⟩
Accès au texte intégral et bibtex

titre: BinauRec: A dataset to test the influence of the use of room impulse responses on binaural speech enhancement
auteur: Louis Delebecque, Romain Serizel
article: EUSIPCO 2023, EURASIP, Sep 2023, Helsinki, Finland. ⟨10.23919/EUSIPCO58844.2023.10289772⟩
Accès au texte intégral et bibtex

titre: Signal Inpainting from Fourier Magnitudes
auteur: Louis Bahrman, Marina Krémé, Paul Magron, Antoine Deleforge
article: EUSIPCO 2023, Sep 2023, Helsinki, Finland. ⟨10.23919/EUSIPCO58844.2023.10289727⟩
Accès au texte intégral et bibtex

titre: The CHiME-7 UDASE task: Unsupervised domain adaptation for conversational speech enhancement
auteur: Simon Leglaive, Léonie Borne, Efthymios Tzinis, Mostafa Sadeghi, Matthieu Fraticelli, Scott Wisdom, Manuel Pariente, Daniel Pressnitzer, John R. Hershey
article: 7th International Workshop on Speech Processing in Everyday Environments (CHiME), Aug 2023, Dublin, Ireland. ⟨10.21437/CHiME.2023-2⟩
Accès au texte intégral et bibtex

titre: How to (Virtually) Train Your Speaker Localizer
auteur: Prerak Srivastava, Antoine Deleforge, Archontis Politis, Emmanuel Vincent
article: INTERSPEECH 2023, Aug 2023, Dublin, Ireland
Accès au texte intégral et bibtex

titre: Self-supervised learning with diffusion-based multichannel speech enhancement for speaker verification under noisy conditions
auteur: Sandipana Dowerah, Ajinkya Kulkarni, Romain Serizel, Denis Jouvet
article: INTERSPEECH 2023, Aug 2023, Dublin, Ireland. pp.3849-3853, ⟨10.21437/Interspeech.2023-1890⟩
Accès au texte intégral et bibtex

titre: Stochastic Pitch Prediction Improves the Diversity and Naturalness of Speech in Glow-TTS
auteur: Sewade Ogun, Vincent Colotte, Emmanuel Vincent
article: INTERSPEECH 2023, Aug 2023, Dublin, Ireland
Accès au texte intégral et bibtex

titre: Modeling the temporal evolution of the vocal tract shape with deep learning
auteur: Yves Laprie, Vinicius Ribeiro, Karyna Isaieva, Justine Leclere, Jacques Felblinger, Pierre-André Vuissoz
article: 20th International Congress of Phonetic Sciences (ICPhS 2023), Aug 2023, Prague, Czech Republic
Accès au texte intégral et bibtex

titre: Non-pulmonic initiation in human beatboxing: a real-time MRI study
auteur: Alexis Dehais-Underdown, Paul Vignes, Lise Crevier-Buchman, Didier Demolin, Pierre-André Vuissoz, Karyna Isaieva, Marc Fauvel, Yves Laprie, Jacques Felblinger
article: 20th International Congress of Phonetic Sciences (ICPhS 2023), Aug 2023, Prague, Czech Republic
Accès au texte intégral et bibtex

titre: Performance above all? Energy consumption vs. performance for machine listening, a study on DCASE Task 4 baseline
auteur: Romain Serizel, Samuele Cornell, Nicolas Turpault
article: ICASSP 2023 – 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Jun 2023, Rhodes Island, Greece. pp.1-5, ⟨10.1109/ICASSP49357.2023.10095938⟩
Accès au texte intégral et bibtex

titre: Fast and efficient speech enhancement with variational autoencoders
auteur: Mostafa Sadeghi, Romain Serizel
article: International Conference on Acoustics Speech and Signal Processing (ICASSP), IEEE, Jun 2023, Rhodes Island, Greece
Accès au texte intégral et bibtex

titre: Lightweight Annotation and Class Weight Training for Automatic Estimation of Alarm Audibility in Noise
auteur: François Effa, Romain Serizel, Jean-Pierre Arz, Nicolas Grimault
article: ICASSP 2023 – 2023 IEEE International Conference on Acoustics, Speech and Signal Processing, Jun 2023, Rhodes Island, Greece. pp.1-5, ⟨10.1109/ICASSP49357.2023.10094730⟩
Accès au texte intégral et bibtex

titre: SPICE+: Evaluation of automatic audio captioning systems with pre-trained language models
auteur: Félix Gontier, Romain Serizel, Christophe Cerisara
article: IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2023), Jun 2023, Rhodes Island, Greece
Accès au texte intégral et bibtex

titre: Audio-visual speech enhancement with a deep kalman filter generative model
auteur: Ali Golmakani, Mostafa Sadeghi, Romain Serizel
article: International Conference on Acoustics Speech and Signal Processing (ICASSP), IEEE, Jun 2023, Rhodes Island, Greece
Accès au texte intégral et bibtex

titre: Improving Hate Speech Detection with Self-Attention Mechanism and Multi-Task Learning
auteur: Nicolas Zampieri, Irina Illina, Dominique Fohr
article: LTC’23 – 10th Language & Technology Conference: Human Language Technologies as a Challenge for Computer Science and Linguistics, Apr 2023, Poznan, Poland
Accès au texte intégral et bibtex

titre: Semantic Information Investigation for Transformer-based Rescoring of N-best Speech Recognition
auteur: Irina Illina, Dominique Fohr
article: LTC 2023, Apr 2023, Poznan, Poland
Accès au texte intégral et bibtex

titre: Can we use Common Voice to train a Multi-Speaker TTS system?
auteur: Sewade Ogun, Vincent Colotte, Emmanuel Vincent
article: The 2022 IEEE Spoken Language Technology Workshop (SLT 2022), Jan 2023, Doha, Qatar
Accès au texte intégral et bibtex

titre: Joint optimization of diffusion probabilistic-based multichannel speech enhancement with far-field speaker verification
auteur: Sandipana Dowerah, Romain Serizel, Denis Jouvet, Mohammad Mohammadamini, Driss Matrouf
article: IEEE SLT 2022, Jan 2023, Doha, Qatar
Accès au texte intégral et bibtex

Poster communications

titre: End-to-end Multichannel Speaker-Attributed ASR: Speaker Guided Decoder and Input Feature Analysis
auteur: Can Cui, Imran Ahamad Sheikh, Mostafa Sadeghi, Emmanuel Vincent
article: Rencontre des Jeunes Chercheurs en Parole 2023 – 10E Edition, Nov 2023, Grenoble, France
Accès au texte intégral et bibtex

Reports

titre: Robovox: Far-Field Speaker Recognition By A Mobile Robot (Evaluation Plan)
auteur: Mohammad Mohammadamini, Mickael Rouvier, Driss Matrouf, Jean-François Bonastre, Romain Serizel, Denis Jouvet, Théophile Gonos
article: Avignon Université. 2023
Accès au texte intégral et bibtex

titre: Supervised contrastive learning for pre-training bioacoustic few-shot systems
auteur: Ilyass Moummad, Romain Serizel, Nicolas Farrugia
article: IMT Atlantique; LORIA. 2023
Accès au texte intégral et bibtex

Theses

titre: Realism in virtually supervised learning for acoustic room characterization and sound source localization
auteur: Prerak Srivastava
article: Machine Learning [cs.LG]. Université de Lorraine, 2023. English. ⟨NNT : 2023LORR0184⟩
Accès au texte intégral et bibtex

titre: Deep Learning-based Speaker Verification In Real Conditions
auteur: Sandipana Dowerah
article: Computer Science [cs]. Université de Lorraine; CNRS, Inria, LORIA, 2023. English. ⟨NNT : 2023LORR0046⟩
Accès au texte intégral et bibtex

titre: Anonymizing Speech: Evaluating and Designing Speaker Anonymization Techniques
auteur: Pierre Champion
article: Artificial Intelligence [cs.AI]. Université de Lorraine, 2023. English. ⟨NNT : 2023LORR0101⟩
Accès au texte intégral et bibtex

titre: Enriching large language models with semantic lexicons and analogies
auteur: Georgios Zervakis
article: Document and Text Processing. Université de Lorraine, 2023. English. ⟨NNT : 2023LORR0039⟩
Accès au texte intégral et bibtex

titre: Deep learning for stuttering detection
auteur: Shakeel Ahmad Sheikh
article: Computer Science [cs]. Université de Lorraine, 2023. English. ⟨NNT : 2023LORR0005⟩
Accès au texte intégral et bibtex

titre: Transfer learning for abusive language detection
auteur: Tulika Bose
article: Computer Science [cs]. Université de Lorraine, 2023. English. ⟨NNT : 2023LORR0019⟩
Accès au texte intégral et bibtex

Preprints, Working Papers, …

titre: End-to-end Joint Rich and Normalized ASR with a limited amount of rich training data
auteur: Can Cui, Imran Ahamad Sheikh, Mostafa Sadeghi, Emmanuel Vincent
article: 2023
Accès au texte intégral et bibtex

2022

Journal articles

titre: Gridless 3D Recovery of Image Sources from Room Impulse Responses
auteur: Tom Sprunck, Antoine Deleforge, Yannick Privat, Cédric Foy
article: IEEE Signal Processing Letters, 2022, ⟨10.1109/LSP.2022.3224682⟩
Accès au texte intégral et bibtex

titre: An Overview of Indian Spoken Language Recognition from Machine Learning Perspective
auteur: Spandan Dey, Md Sahidullah, Goutam Saha
article: ACM Transactions on Asian and Low-Resource Language Information Processing, 2022, 21 (6), pp.1-45. ⟨10.1145/3523179⟩
Accès au texte intégral et bibtex

titre: Machine Learning for Stuttering Identification: Review, Challenges & Future Directions
auteur: Shakeel Sheikh, Md Sahidullah, Fabrice Hirsch, Slim Ouni
article: Neurocomputing, 2022, 514 (2022), pp.17. ⟨10.1016/j.neucom.2022.10.015⟩
Accès au texte intégral et bibtex

titre: Analysis of constant-Q filterbank based representations for speech emotion recognition
auteur: Premjeet Singh, Shefali Waldekar, Md Sahidullah, Goutam Saha
article: Digital Signal Processing, 2022, 130, pp.103712. ⟨10.1016/j.dsp.2022.103712⟩
Accès au texte intégral et bibtex

titre: 3D dynamic spatiotemporal atlas of the vocal tract during consonant-vowel production from 2D real time MRI
auteur: Ioannis K Douros, Yu Xie, Chrysanthi Dourou, Karyna Isaieva, Pierre-André Vuissoz, Jacques Felblinger, Yves Laprie
article: Journal of Imaging, 2022, Special Issue Spatio-Temporal Biomedical Image Analysis, 8 (9), pp.227. ⟨10.3390/jimaging8090227⟩
Accès au texte intégral et bibtex

titre: Robust acoustic domain identification with its application to speaker diarization
auteur: A Kishore Kumar, Shefali Waldekar, Md Sahidullah, Goutam Saha
article: International Journal of Speech Technology, 2022, 25 (December), pp.933-945. ⟨10.1007/s10772-022-09990-9⟩
Accès au texte intégral et bibtex

titre: The VoicePrivacy 2020 Challenge: Results and findings
auteur: Natalia Tomashenko, Xin Wang, Emmanuel Vincent, Jose Patino, Brij Mohan Lal Srivastava, Paul-Gauthier Noé, Andreas Nautsch, Nicholas Evans, Junichi Yamagishi, Benjamin O’Brien, Anaïs Chanclu, Jean-François Bonastre, Massimiliano Todisco, Mohamed Maouche
article: Computer Speech and Language, 2022, 74, pp.101362. ⟨10.1016/j.csl.2022.101362⟩
Accès au texte intégral et bibtex

titre: Privacy and utility of x-vector based speaker anonymization
auteur: Brij Mohan Lal Srivastava, Mohamed Maouche, Md Sahidullah, Emmanuel Vincent, Aurélien Bellet, Marc Tommasi, Natalia Tomashenko, Xin Wang, Junichi Yamagishi
article: IEEE/ACM Transactions on Audio, Speech and Language Processing, 2022, ⟨10.1109/TASLP.2022.3190741⟩
Accès au texte intégral et bibtex

titre: A majorization-minimization algorithm for nonnegative binary matrix factorization
auteur: Paul Magron, Cédric Févotte
article: IEEE Signal Processing Letters, 2022, ⟨10.1109/LSP.2022.3187368⟩
Accès au texte intégral et bibtex

titre: Automatic generation of the complete vocal tract shape from the sequence of phonemes to be articulated
auteur: Vinicius Ribeiro, Karyna Isaieva, Justine Leclere, Pierre-André Vuissoz, Yves Laprie
article: Speech Communication, 2022, 141, pp.1-13. ⟨10.1016/j.specom.2022.04.004⟩
Accès au texte intégral et bibtex

titre: Non-Smooth Regularization: Improvement to Learning Framework through Extrapolation
auteur: Sajjad Amini, Mohammad Soltanian, Mostafa Sadeghi, Shahrokh Ghaemmaghami
article: IEEE Transactions on Signal Processing, 2022, 70, pp.1213-1223. ⟨10.1109/TSP.2022.3154969⟩
Accès au texte intégral et bibtex

titre: Overlapped speech detection and speaker counting using distant microphone arrays
auteur: Samuele Cornell, Maurizio Omologo, Stefano Squartini, Emmanuel Vincent
article: Computer Speech and Language, 2022, 72, pp.101306. ⟨10.1016/j.csl.2021.101306⟩
Accès au texte intégral et bibtex

titre: Neural content-aware collaborative filtering for cold-start music recommendation
auteur: Paul Magron, Cédric Févotte
article: Data Mining and Knowledge Discovery, 2022, ⟨10.1007/s10618-022-00859-8⟩
Accès au texte intégral et bibtex

titre: Learning the Proximity Operator in Unfolded ADMM for Phase Retrieval
auteur: Pierre-Hugo Vial, Paul Magron, Thomas Oberlin, Cédric Févotte
article: IEEE Signal Processing Letters, 2022, 29, pp.1619-1623. ⟨10.1109/LSP.2022.3189275⟩
Accès au texte intégral et bibtex

Conference papers

titre: An analogy based approach for solving target sense verification
auteur: Georgios Zervakis, Emmanuel Vincent, Miguel Couceiro, Marc Schoenauer, Esteban Marquer
article: NLPIR 2022 – 6th International Conference on Natural Language Processing and Information Retrieval, Dec 2022, Bangkok, Thailand
Accès au texte intégral et bibtex

titre: Transferring Knowledge via Neighborhood-Aware Optimal Transport for Low-Resource Hate Speech Detection
auteur: Tulika Bose, Irina Illina, Dominique Fohr
article: Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing (AACL-IJCNLP), Nov 2022, Online, Taiwan
Accès au texte intégral et bibtex

titre: Chop and change: Anaphora resolution in instructional cooking videos
auteur: Cennet Oguz, Ivana Kruijff-Korbayová, Pascal Denis, Emmanuel Vincent, Josef van Genabith
article: Findings of AACL-IJCNLP 2022 – 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics – 12th International Joint Conference on Natural Language Processing, Nov 2022, Taipei, Taiwan
Accès au texte intégral et bibtex

titre: Integrating isolated examples with weakly-supervised sound event detection: a direct approach
auteur: Mohammad Abdollahi, Romain Serizel, Alain Rakotomamonjy, Gilles Gasso
article: 7th Workshop on Detection and Classification of Acoustic Scenes and Events (DCASE), Nov 2022, Nancy, France
Accès au texte intégral et bibtex

titre: Local time-frequency fading
auteur: Ama Marina Kreme, Bruno Torrésani, Antoine Deleforge
article: ICA 22 – International Congress on Acoustics 2022, Oct 2022, Gyeongju, South Korea
Accès au texte intégral et bibtex

titre: Accelerating the Centerline Processing of Vocal Tract Shapes for Articulatory Synthesis
auteur: Romain Karpinski, Vinicius Ribeiro, Yves Laprie
article: ICA 2022 – 24th International Congress on Acoustics, Oct 2022, Gyeongju, South Korea
Accès au texte intégral et bibtex

titre: How to Leverage DNN-based speech enhancement for multi-channel speaker verification?
auteur: Sandipana Dowerah, Romain Serizel, Denis Jouvet, Mohammad Mohammadamini, Driss Matrouf
article: 4th International Conference on Advances in Signal Processing and Artificial Intelligence (ASPAI’ 2022), Oct 2022, Corfu, Greece
Accès au texte intégral et bibtex

titre: End-to-End and Self-Supervised Learning for ComParE 2022 Stuttering Sub-Challenge
auteur: Shakeel A Sheikh, Md Sahidullah, Fabrice Hirsch, Slim Ouni
article: ACM Multimedia 2022 Computational Paralinguistics Challenge (ComParE), Oct 2022, Lisbon, Portugal
Accès au texte intégral et bibtex

titre: Domain Classification-based Source-specific Term Penalization for Domain Adaptation in Hate-speech Detection
auteur: Tulika Bose, Nikolaos Aletras, Irina Illina, Dominique Fohr
article: COLING 2022 – Proceedings of the 29th International Conference on Computational Linguistics, Oct 2022, Gyeongju, South Korea
Accès au texte intégral et bibtex

titre: Barlow Twins self-supervised learning for robust speaker recognition
auteur: Mohammad Mohammadamini, Driss Matrouf, Jean-François A Bonastre, Sandipana Dowerah, Romain Serizel, Denis Jouvet
article: Interspeech 2022 – Human and Humanizing Speech Technology, Sep 2022, Incheon, South Korea. ⟨10.21437/Interspeech.2022-11301⟩
Accès au texte intégral et bibtex

titre: Enhancing speech privacy with slicing
auteur: Mohamed Maouche, Brij Mohan Lal Srivastava, Nathalie Vauquier, Aurélien Bellet, Marc Tommasi, Emmanuel Vincent
article: Interspeech 2022 – Human and Humanizing Speech Technology, Sep 2022, Incheon, South Korea
Accès au texte intégral et bibtex

titre: A Sparsity-promoting Dictionary Model for Variational Autoencoders
auteur: Mostafa Sadeghi, Paul Magron
article: INTERSPEECH 2022, Sep 2022, Incheon, South Korea
Accès au texte intégral et bibtex

titre: Are disentangled representations all you need to build speaker anonymization systems?
auteur: Pierre Champion, Denis Jouvet, Anthony Larcher
article: INTERSPEECH 2022 – Human and Humanizing Speech Technology, Sep 2022, Incheon, South Korea
Accès au texte intégral et bibtex

titre: Autoencoder-Based Tongue Shape Estimation During Continuous Speech
auteur: Vinicius Ribeiro, Yves Laprie
article: 23rd INTERSPEECH Conference on “Human and Humanizing Speech Technology”, Sep 2022, Incheon, South Korea
Accès au texte intégral et bibtex

titre: Analysis of expressivity transfer in non-autoregressive end-to-end multispeaker TTS systems
auteur: Ajinkya Kulkarni, Vincent Colotte, Denis Jouvet
article: INTERSPEECH 2022, Sep 2022, Incheon, South Korea
Accès au texte intégral et bibtex

titre: Exploration of Multi-Corpus Learning for Hate Speech Classification in Low Resource Scenarios
auteur: Ashwin Geet d’Sa, Irina Illina, Dominique Fohr, Awais Akbar
article: TSD 2022 – 25th International Conference on Text, Speech and Dialogue, Sep 2022, Brno, Czech Republic
Accès au texte intégral et bibtex

titre: Vers un système embarqué de classification d’événements sonores : étude de l’impact de la quantification des descripteurs
auteur: Marie-Anne Lacroix, Nancy Bertin, Romuald Rocher, Pascal Scalart
article: GRETSI 2022 XXVIIIème Colloque Francophone de Traitement du Signal et des Images, Sep 2022, Nancy, France
Accès au texte intégral et bibtex

titre: Realistic sources, receivers and walls improve the generalisability of virtually-supervised blind acoustic parameter estimators
auteur: Prerak Srivastava, Antoine Deleforge, Emmanuel Vincent
article: 17th International Workshop on Acoustic Signal Enhancement (IWAENC), Sep 2022, Bamberg, Germany
Accès au texte intégral et bibtex

titre: Geometry-Informed Estimation of Surface Absorption Profiles from Room Impulse Responses
auteur: Stéphane Dilungana, Antoine Deleforge, Cédric Foy, Sylvain Faisan
article: 30th European Signal Processing Conference (EUSIPCO), Aug 2022, Belgrade, Serbia. pp.867-871, ⟨10.23919/EUSIPCO55093.2022.9909667⟩
Accès au texte intégral et bibtex

titre: Multi-stage attention for fine-grained expressivity transfer in multispeaker text-to-speech system
auteur: Ajinkya Kulkarni, Vincent Colotte, Denis Jouvet
article: EUSIPCO 2022, Aug 2022, Belgrade, Serbia
Accès au texte intégral et bibtex

titre: Robust Stuttering Detection via Multi-task and Adversarial Learning
auteur: Shakeel Sheikh, Md Sahidullah, Fabrice Hirsch, Slim Ouni
article: EUSIPCO 2022 – 30th European Signal Processing Conference, Aug 2022, Belgrade, Serbia
Accès au texte intégral et bibtex

titre: A Comprehensive Exploration of Noise Robustness and Noise Compensation in ResNet and TDNN-based Speaker Recognition Systems
auteur: Mohammad Mohammadamini, Driss Matrouf, Jean-François Bonastre, Sandipana Dowerah, Romain Serizel, Denis Jouvet
article: EUSIPCO 2022 – 30th European Signal Processing Conference, Aug 2022, Belgrade, Serbia
Accès au texte intégral et bibtex

titre: Synchronization of speech and gestures in an interactional context (SyncoGest Project)
auteur: Domitille Caillat, Ludovic Marin, Christelle Dodane, Fabrice Hirsch, Slim Ouni, Pierre Slangen, Patrice Guyot, Vincent Colotte, Aliyah Morgenstern, Louis Abel, Mickaëlla Grondin-Verdon, Juliette Lozano Goupil
article: ISGS 2022 – 9th Conference of the International Society for Gesture Studies, Jul 2022, Chicago, United States
Accès au bibtex

titre: Spoofing-Aware Speaker Verification with Unsupervised Domain Adaptation
auteur: Xuechen Liu, Md Sahidullah, Tomi Kinnunen
article: Odyssey 2022 – The Speaker and Language Recognition Workshop, Jun 2022, Beijing, China. pp.85-91, ⟨10.21437/Odyssey.2022-12⟩
Accès au texte intégral et bibtex

titre: Identification des Expressions Polylexicales dans les Tweets
auteur: Nicolas Zampieri, Carlos Ramisch, Irina Illina, Dominique Fohr
article: RECITAL 2022 – Traitement Automatique des Langues Naturelles (TALN), Jun 2022, Avignon, France
Accès au texte intégral et bibtex

titre: Adapting Language Models When Training on Privacy-Transformed Data
auteur: Mehmet Ali Tugtekin Turan, Dietrich Klakow, Emmanuel Vincent, Denis Jouvet
article: LREC 2022 – 13th Language Resources and Evaluation Conference, Jun 2022, Marseille, France
Accès au texte intégral et bibtex

titre: Identification of Multiword Expressions in Tweets for Hate Speech Detection
auteur: Nicolas Zampieri, Carlos Ramisch, Irina Illina, Dominique Fohr
article: LREC 2022 – 13th Language Resources and Evaluation Conference, Jun 2022, Marseille, France
Accès au texte intégral et bibtex

titre: Transformer versus LSTM Language Models Trained on Uncertain ASR Hypotheses in Limited Data Scenarios
auteur: Imran Ahamad Sheikh, Emmanuel Vincent, Irina Illina
article: LREC 2022 – 13th Language Resources and Evaluation Conference, Jun 2022, Marseille, France
Accès au texte intégral et bibtex

titre: Placing M-Phasis on the Plurality of Hate: A Feature-Based Corpus of Hate Online
auteur: Dana Ruiter, Liane Reiners, Ashwin Geet d’Sa, Thomas Kleinbauer, Dominique Fohr, Irina Illina, Dietrich Klakow, Christian Schemer, Angeliki Monnier
article: LREC 2022 – 13th Language Resources and Evaluation Conference, Jun 2022, Marseille, France. pp.791-804
Accès au texte intégral et bibtex

titre: Anonymisation de parole par quantification vectorielle
auteur: Pierre Champion, Denis Jouvet, Anthony Larcher
article: JEP 2022 – Journées d’Études sur la Parole, Jun 2022, Île de Noirmoutier, France
Accès au texte intégral et bibtex

titre: La vélocité des mouvements labiaux et mandibulaires : un indice pour différencier les disfluences typiques du bégaiement et les disfluences normales ? Une étude pilote
auteur: Fabrice Hirsch, Ivana Didirková, Slim Ouni, Shakeel Ahmad Sheikh, Yves Laprie, Marie-Claude Monfrais-Pfauwadel, Eléonor Burkhardt
article: 34emes Journées d’Etudes sur la Parole – JEP2022, Jun 2022, Île de Noirmoutier, France. ⟨10.21437/JEP.2022-18⟩
Accès au bibtex

titre: Analyse de l’anonymisation du locuteur sur de la parole émotionnelle
auteur: Hubert Nourtel, Pierre Champion, Denis Jouvet, Anthony Larcher, Marie Tahon
article: JEP 2022 – Journées d’Études sur la Parole, Jun 2022, Île de Noirmoutier, France
Accès au texte intégral et bibtex

titre: BERT Semantic Context Model for Efficient Speech Recognition
auteur: Irina Illina, Dominique Fohr
article: ICCAS 2022 – International Conference on Cognitive Aircraft Systems, ISAE-SUPAERO, Jun 2022, Toulouse, France
Accès au bibtex

titre: Baselines and Protocols for Household Speaker Recognition
auteur: Alexey Sholokhov, Xuechen Liu, Md Sahidullah, Tomi Kinnunen
article: The Speaker and Language Recognition Workshop (Odyssey 2022), Jun 2022, Beijing, China. pp.185-192, ⟨10.21437/Odyssey.2022-26⟩
Accès au texte intégral et bibtex

titre: Baseline Systems for the First Spoofing-Aware Speaker Verification Challenge: Score and Embedding Fusion
auteur: Hye-Jin Shim, Hemlata Tak, Xuechen Liu, Hee-Soo Heo, Jee-Weon Jung, Joon Son Chung, Soo-Whan Chung, Ha-Jin Yu, Bong-Jin Lee, Massimiliano Todisco, Héctor Delgado, Kong Aik Lee, Md Sahidullah, Tomi Kinnunen, Nicholas Evans
article: Odyssey 2022 – The Speaker and Language Recognition Workshop, Jun 2022, Beijing, China
Accès au bibtex

titre: Threshold independent evaluation of sound event detection scores
auteur: Janek Ebbers, Reinhold Haeb-Umbach, Romain Serizel
article: ICASSP 2022 – IEEE International Conference on Acoustics, Speech and Signal Processing, May 2022, Singapore, Singapore. ⟨10.1109/ICASSP43922.2022.9747556⟩
Accès au texte intégral et bibtex

titre: Learnable Nonlinear Compression for Robust Speaker Verification
auteur: Xuechen Liu, Md Sahidullah, Tomi Kinnunen
article: ICASSP 2022 – IEEE International Conference on Acoustics, Speech and Signal Processing, May 2022, Singapore, Singapore. ⟨10.1109/ICASSP43922.2022.9747185⟩
Accès au texte intégral et bibtex

titre: Dynamically Refined Regularization for Improving Cross-corpora Hate Speech Detection
auteur: Tulika Bose, Nikolaos Aletras, Irina Illina, Dominique Fohr
article: ACL 2022 – 60th meeting Association for Computational Linguistics Findings, May 2022, Dublin, Ireland. ⟨10.18653/v1/2022.findings-acl.32⟩
Accès au texte intégral et bibtex

titre: A benchmark of state-of-the-art sound event detection systems evaluated on synthetic soundscapes
auteur: Francesca Ronchini, Romain Serizel
article: ICASSP 2022 – IEEE International Conference on Acoustics, Speech and Signal Processing, May 2022, Singapore/Virtual, Singapore. ⟨10.1109/ICASSP43922.2022.9747577⟩
Accès au texte intégral et bibtex

titre: On the impact of normalization strategies in unsupervised adversarial domain adaptation for acoustic scene classification
auteur: Michel Olvera, Emmanuel Vincent, Gilles Gasso
article: ICASSP 2022 – IEEE International Conference on Acoustics, Speech and Signal Processing, May 2022, Singapore, Singapore. ⟨10.1109/ICASSP43922.2022.9747540⟩
Accès au texte intégral et bibtex

titre: The Impact of Removing Head Movements on Audio-visual Speech Enhancement
auteur: Zhiqi Kang, Mostafa Sadeghi, Radu Horaud, Xavier Alameda-Pineda, Jacob Donley, Anurag Kumar
article: ICASSP 2022 – IEEE International Conference on Acoustics, Speech and Signal Processing, IEEE Signal Processing Society, May 2022, Singapore, Singapore. pp.1-5, ⟨10.1109/ICASSP43922.2022.9746401⟩
Accès au texte intégral et bibtex

titre: Perception of German fricatives by French dyslexic subjects
auteur: Stéphanie Deckert, Agnès Piquard-Kipffer, Anne Bonneau
article: New Sounds 2022, 10th International Symposium on the Acquisition of Second Language Speech, Apr 2022, Barcelona, Spain
Accès au bibtex

titre: Evaluation de l’audibilité ressentie des alarmes sonores dans le bruit
auteur: Jean-Pierre Arz, François Effa, Nicolas Grimault, Romain Serizel
article: 16ème Congrès Français d’Acoustique, CFA2022, Société Française d’Acoustique; Laboratoire de Mécanique et d’Acoustique, Apr 2022, Marseille, France
Accès au bibtex

titre: Modélisation de la détection d’alarmes sonores dans le bruit
auteur: François Effa, Jean-Pierre Arz, Nicolas Grimault, Ossen El Sawaf, Romain Serizel
article: 16ème Congrès Français d’Acoustique, CFA2022, Société Française d’Acoustique; Laboratoire de Mécanique et d’Acoustique, Apr 2022, Marseille, France
Accès au bibtex

titre: Reconstruction de la forme d’une pièce par super-résolution à l’aide de réponses impulsionnelles
auteur: Tom Sprunck, Khaoula Chahdi, Cédric Foy, Emmanuel Franck, Antoine Deleforge
article: 16ème Congrès Français d’Acoustique, CFA2022, Société Française d’Acoustique; Laboratoire de Mécanique et d’Acoustique, Apr 2022, Marseille, France
Accès au bibtex

Habilitation à diriger des recherches

titre: Contributions to speech processing and ambient sound analysis
auteur: Romain Serizel
article: Computer Science [cs]. Université de Lorraine, 2022
Accès au texte intégral et bibtex

Proceedings

titre: Proceedings of the 7th Workshop on Detection and Classification of Acoustic Scenes and Events (DCASE 2022)
auteur: Mathieu Lagrange, Annamaria Mesaros, Thomas Pellegrini, Gael Richard, Romain Serizel, Dan Stowell
article: Tampere University, pp.1-225, 2022, 978-952-03-2677-7
Accès au texte intégral et bibtex

Theses

titre: Robust sound event detection
auteur: Michel Olvera
article: Computer Science [cs]. Université de Lorraine, 2022. English. ⟨NNT : 2022LORR0324⟩
Accès au texte intégral et bibtex

titre: Expressivity transfer in deep learning based text-to-speech synthesis
auteur: Ajinkya Kulkarni
article: Machine Learning [cs.LG]. Université de Lorraine, 2022. English. ⟨NNT : 2022LORR0122⟩
Accès au texte intégral et bibtex

titre: Expanding the training data for neural network based hate speech classification
auteur: Ashwin Geet d’Sa
article: Computer Science [cs]. Université de Lorraine, 2022. English. ⟨NNT : 2022LORR0081⟩
Accès au texte intégral et bibtex

titre: Exploitation de transcriptions bruitées pour la reconnaissance automatique de la parole
auteur: Adrien Dufraux
article: Informatique [cs]. Université de Lorraine, 2022. Français. ⟨NNT : 2022LORR0032⟩
Accès au texte intégral et bibtex

Preprints, Working Papers, …

titre: Supplementary material to the paper The VoicePrivacy 2020 Challenge: Results and findings
auteur: Natalia Tomashenko, Xin Wang, Emmanuel Vincent, Jose Patino, Brij Mohan Lal Srivastava, Paul-Gauthier Noé, Andreas Nautsch, Nicholas Evans, Junichi Yamagishi, Benjamin O’Brien, Anaïs Chanclu, Jean-François Bonastre, Massimiliano Todisco, Mohamed Maouche
article: 2022
Accès au texte intégral et bibtex

titre: The VoicePrivacy 2022 Challenge Evaluation Plan
auteur: Natalia Tomashenko, Xin Wang, Xiaoxiao Miao, Hubert Nourtel, Pierre Champion, Massimiliano Todisco, Emmanuel Vincent, Nicholas Evans, Junichi Yamagishi, Jean-François Bonastre
article: 2022
Accès au texte intégral et bibtex

titre: Étude d’un algorithme d’optimisation pour le fading temps-fréquence
auteur: Marina Krémé, Bruno Torrésani
article: 2022
Accès au texte intégral et bibtex

titre: Towards an efficient computation of masks for multichannel speech enhancement
auteur: Louis Delebecque, Romain Serizel, Nicolas Furnon
article: 2022
Accès au texte intégral et bibtex

2021

Journal articles

titre: Utterance partitioning for speaker recognition: an experimental review and analysis with new findings under GMM-SVM framework
auteur: Nirmalya Sen, Md Sahidullah, Hemant Patil, Shyamal Kumar das Mandal, Sreenivasa Krothapalli Rao, Tapan Kumar Basu
article: International Journal of Speech Technology, 2021, 24, pp.1067-1088. ⟨10.1007/s10772-021-09862-8⟩
Accès au texte intégral et bibtex

titre: dEchorate: a Calibrated Room Impulse Response Dataset for Echo-aware Signal Processing
auteur: Diego Di Carlo, Pinchas Tandeitnik, Cédric Foy, Nancy Bertin, Antoine Deleforge, Sharon Gannot
article: EURASIP Journal on Audio, Speech, and Music Processing, 2021, 39, ⟨10.1186/s13636-021-00229-0⟩
Accès au texte intégral et bibtex

titre: Multimodal dataset of real-time 2D and static 3D MRI of healthy French speakers
auteur: Karyna Isaieva, Yves Laprie, Justine Leclère, Ioannis K Douros, Jacques Felblinger, Pierre-André Vuissoz
article: Scientific Data, 2021, 8 (1), pp.258. ⟨10.1038/s41597-021-01041-3⟩
Accès au texte intégral et bibtex

titre: A detailed study of the distributed rough set based locality sensitive hashing feature selection technique
auteur: Zaineb Chelly Dagdia, Christine Zarges
article: Fundamenta Informaticae, 2021, 182 (2), pp.111-179. ⟨10.3233/FI-2021-2069⟩
Accès au texte intégral et bibtex

titre: Enabling voice-based apps with European values
auteur: Akira Campbell, Thomas Kleinbauer, Marc Tommasi, Emmanuel Vincent
article: ERCIM News, 2021, 126, pp.38-39
Accès au bibtex

titre: Impact of lip-reading on speech perception in French-speaking children at risk for reading failure assessed from age 5 to 7
auteur: Agnès Piquard-Kipffer, Thalia Cavadini, Liliane Sprenger-Charolles, Edouard Gentaz
article: L’Année psychologique, 2021, 121, pp.3-18. ⟨10.3917/anpsy1.212.0003⟩
Accès au texte intégral et bibtex

titre: Mixture of Inference Networks for VAE-based Audio-visual Speech Enhancement
auteur: Mostafa Sadeghi, Xavier Alameda-Pineda
article: IEEE Transactions on Signal Processing, 2021, 69, pp.1899-1909. ⟨10.1109/TSP.2021.3066038⟩
Accès au texte intégral et bibtex

titre: ASVspoof 2019: Spoofing Countermeasures for the Detection of Synthesized, Converted and Replayed Speech
auteur: Andreas Nautsch, Xin Wang, Nicholas Evans, Tomi Kinnunen, Ville Vestman, Massimiliano Todisco, Hector Delgado, Md Sahidullah, Junichi Yamagishi, Kong Aik Lee
article: IEEE Transactions on Biometrics, Behavior, and Identity Science, 2021, 3 (2), pp.252-265. ⟨10.1109/TBIOM.2021.3059479⟩
Accès au texte intégral et bibtex

titre: Speech Frame Selection for Spoofing Detection with an Application to Partially Spoofed Audio-Data
auteur: Kishore A. Kumar, Dipjyoti Paul, Monisankha Pal, Md Sahidullah, Goutam Saha
article: International Journal of Speech Technology, 2021, ⟨10.1007/s10772-020-09785-w⟩
Accès au texte intégral et bibtex

titre: Mean absorption estimation from room impulse responses using virtually supervised learning
auteur: Cédric Foy, Antoine Deleforge, Diego Di Carlo
article: Journal of the Acoustical Society of America, 2021, 150 (2), pp.1286-1299. ⟨10.1121/10.0005888⟩
Accès au texte intégral et bibtex

titre: Optimizing Multi-Taper Features for Deep Speaker Verification
auteur: Xuechen Liu, Md Sahidullah, Tomi Kinnunen
article: IEEE Signal Processing Letters, 2021, 28, pp.2187-2191. ⟨10.1109/LSP.2021.3122796⟩
Accès au texte intégral et bibtex

titre: DNN-based mask estimation for distributed speech enhancement in spatially unconstrained microphone arrays
auteur: Nicolas Furnon, Romain Serizel, Slim Essid, Irina Illina
article: IEEE/ACM Transactions on Audio, Speech and Language Processing, 2021, 29, pp.2310-2323. ⟨10.1109/TASLP.2021.3092838⟩
Accès au texte intégral et bibtex

titre: Learning emotions latent representation with CVAE for Text-Driven Expressive AudioVisual Speech Synthesis
auteur: Sara Dahmani, Vincent Colotte, Valérian Girard, Slim Ouni
article: Neural Networks, 2021, 141, pp.315-329. ⟨10.1016/j.neunet.2021.04.021⟩
Accès au texte intégral et bibtex

Conference papers

titre: Optimized Power Normalized Cepstral Coefficients Towards Robust Deep Speaker Verification
auteur: Xuechen Liu, Md Sahidullah, Tomi Kinnunen
article: ASRU 2021 – IEEE Automatic Speech Recognition and Understanding Workshop, Dec 2021, Cartagena, Colombia
Accès au texte intégral et bibtex

titre: Parameterized Channel Normalization for Far-field Deep Speaker Verification
auteur: Xuechen Liu, Md Sahidullah, Tomi Kinnunen
article: ASRU 2021 – IEEE Automatic Speech Recognition and Understanding Workshop, Dec 2021, Cartagena, Colombia
Accès au texte intégral et bibtex

titre: On the invertibility of a voice privacy system using embedding alignment
auteur: Pierre Champion, Thomas Thebaud, Gaël Le Lan, Anthony Larcher, Denis Jouvet
article: ASRU 2021 – IEEE Automatic Speech Recognition and Understanding Workshop, Dec 2021, Cartagena, Colombia
Accès au texte intégral et bibtex

titre: Projet LogilecSur : quelles stratégies enseignantes pour guider des élèves sourds vers l’autonomie en compréhension écrite ?
auteur: Manuel Leitao, Elodie Venti, Thomas Sigiez, Christophe Laroche, Marie Perini, Agnès Piquard-Kipffer
article: IDEKI 2021 – 4ème colloque international Didactiques et métiers de l’humain, Dec 2021, Pont-à-Mousson, France
Accès au texte intégral et bibtex

titre: De codes gestuo-manuels à la Langue des Signes Française : usages et enjeux à la maternelle dans le cadre des gestes professionnels inclusifs et des adaptations didactiques
auteur: Olivia Janin, Agnès Piquard-Kipffer
article: IDEKI 2021 – 4ème colloque international Didactiques et métiers de l’humain, IDEKI, Dec 2021, Pont-à-Mousson, France
Accès au texte intégral et bibtex

titre: Improving Sound Event Detection with Auxiliary Foreground-Background Classification and Domain Adaptation
auteur: Michel Olvera, Emmanuel Vincent, Gilles Gasso
article: DCASE 2021 – 6th Workshop on Detection and Classification of Acoustic Scenes and Events, Nov 2021, Virtual, Spain
Accès au texte intégral et bibtex

titre: Automated audio captioning by fine-tuning BART with AudioSet tags
auteur: Félix Gontier, Romain Serizel, Christophe Cerisara
article: DCASE 2021 – 6th Workshop on Detection and Classification of Acoustic Scenes and Events, Nov 2021, Virtual, Spain
Accès au texte intégral et bibtex

titre: The impact of non-target events in synthetic soundscapes for sound event detection
auteur: Francesca Ronchini, Romain Serizel, Nicolas Turpault, Samuele Cornell
article: DCASE 2021 – Detection and Classification of Acoustic Scenes and Events, Nov 2021, Barcelona/Virtual, Spain
Accès au texte intégral et bibtex

titre: Benchmarking and challenges in security and privacy for voice biometrics
auteur: Jean-Francois Bonastre, Hector Delgado, Nicholas Evans, Tomi Kinnunen, Kong Aik Lee, Xuechen Liu, Andreas Nautsch, Paul-Gauthier Noe, Jose Patino, Md Sahidullah, Brij Mohan Lal Srivastava, Massimiliano Todisco, Natalia Tomashenko, Emmanuel Vincent, Xin Wang, Junichi Yamagishi
article: SPSC 2021, 1st ISCA Symposium on Security and Privacy in Speech Communication, ISCA, Nov 2021, Magdeburg, Germany. ⟨10.21437/SPSC.2021-11⟩
Accès au texte intégral et bibtex

titre: Evaluation of Speaker Anonymization on Emotional Speech
auteur: Hubert Nourtel, Pierre Champion, Denis Jouvet, Anthony Larcher, Marie Tahon
article: SPSC 2021 – 1st ISCA Symposium on Security and Privacy in Speech Communication, Nov 2021, Virtual, Germany
Accès au texte intégral et bibtex

titre: Deep Variational Generative Models for Audio-visual Speech Separation
auteur: Viet-Nhat Nguyen, Mostafa Sadeghi, Elisa Ricci, Xavier Alameda-Pineda
article: MLSP 2021 – IEEE International Workshop on Machine Learning for Signal Processing, Oct 2021, Gold Coast, Australia. ⟨10.1109/MLSP52302.2021.9596406⟩
Accès au bibtex

titre: Blind room parameter estimation using multiple multichannel speech recordings
auteur: Prerak Srivastava, Antoine Deleforge, Emmanuel Vincent
article: WASPAA 2021 – IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, Oct 2021, New Paltz, NY, United States
Accès au texte intégral et bibtex

titre: Saladnet: Self-Attentive Multisource Localization in the Ambisonics Domain
auteur: Pierre-Amaury Grumiaux, Srdan Kitić, Prerak Srivastava, Laurent Girin, Alexandre Guérin
article: WASPAA 2021 – IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, Oct 2021, New Paltz / Virtual, United States. pp.336-340, ⟨10.1109/WASPAA52581.2021.9632737⟩
Accès au texte intégral et bibtex

titre: Robust Face Frontalization For Visual Speech Recognition
auteur: Zhiqi Kang, Radu Horaud, Mostafa Sadeghi
article: ICCVW 2021 – International Conference on Computer Vision Workshops, IEEE, Oct 2021, Montreal – Virtual, Canada. pp.2485-2495, ⟨10.1109/ICCVW54120.2021.00281⟩
Accès au texte intégral et bibtex

titre: Du développement du langage aux troubles du langage et des apprentissages, enjeux, défis et perspectives
auteur: Agnès Piquard-Kipffer
article: École Doctorale Sociétés, Communication, Arts, Lettres et Langues, Université Félix Houphouët-Boigny, Oct 2021, Abidjan, Côte d’Ivoire
Accès au bibtex

titre: Covid-19 et port du masque à l’école : mise en difficulté de certains élèves
auteur: Agnès Piquard-Kipffer
article: Journée Scientifique Fédération Charles Hermite “COVID”, Fédération Charles Hermite, Sep 2021, Vandœuvre-lès-Nancy, France
Accès au bibtex

titre: Evaluating X-vector-based Speaker Anonymization under White-box Assessment
auteur: Pierre Champion, Denis Jouvet, Anthony Larcher
article: SPECOM 2021 – 23rd International Conference on Speech and Computer, Sep 2021, Saint Petersburg, Russia
Accès au texte intégral et bibtex

titre: A comparative study of different state-of-the-art NLP models for efficient automatic hate speech detection
auteur: Nicolas Zampieri, Irina Illina, Dominique Fohr
article: Comments, hate speech, disinformation and public communication regulation 2021, Sep 2021, Zagreb, Croatia
Accès au bibtex

titre: ASVspoof 2021: accelerating progress in spoofed and deepfake speech detection
auteur: Junichi Yamagishi, Xin Wang, Massimiliano Todisco, Md Sahidullah, Jose Patino, Andreas Nautsch, Xuechen Liu, Kong Aik Lee, Tomi Kinnunen, Nicholas Evans, Héctor Delgado
article: ASVspoof 2021 Workshop – Automatic Speaker Verification and Spoofing Countermeasures Challenge, Sep 2021, Virtual, France
Accès au texte intégral et bibtex

titre: On Refining BERT Contextualized Embeddings using Semantic Lexicons
auteur: Georgios Zervakis, Emmanuel Vincent, Miguel Couceiro, Marc Schoenauer
article: ECML PKDD 2021 – Machine Learning with Symbolic Methods and Knowledge Graphs co-located with European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, Sep 2021, Online, Spain
Accès au texte intégral et bibtex

titre: Exploring Conditional Language Model Based Data Augmentation Approaches For Hate Speech Classification
auteur: Ashwin Geet d’Sa, Irina Illina, Dominique Fohr, Dietrich Klakow, Dana Ruiter
article: TSD 2021 – 24th International Conference on Text, Speech and Dialogue, Sep 2021, Olomouc, Czech Republic
Accès au texte intégral et bibtex

titre: DNN-based semantic rescoring models for speech recognition
auteur: Irina Illina, Dominique Fohr
article: TSD 2021 – 24th International Conference on Text, Speech and Dialogue, Sep 2021, Olomouc, Czech Republic
Accès au texte intégral et bibtex

titre: Visualizing Classifier Adjacency Relations: A Case Study in Speaker Verification and Voice Anti-Spoofing
auteur: Tomi Kinnunen, Andreas Nautsch, Md Sahidullah, Nicholas Evans, Xin Wang, Massimiliano Todisco, Héctor Delgado, Junichi Yamagishi, Kong Aik Lee
article: INTERSPEECH 2021, Aug 2021, Brno, Czech Republic. ⟨10.21437/Interspeech.2021-1522⟩
Accès au texte intégral et bibtex

titre: BERT-based Semantic Model for Rescoring N-best Speech Recognition List
auteur: Dominique Fohr, Irina Illina
article: INTERSPEECH 2021, Aug 2021, Brno, Czech Republic. ⟨10.21437/Interspeech.2021-313⟩
Accès au texte intégral et bibtex

titre: Voicing assimilations by French Speakers of German in stop-fricative sequences
auteur: Anne Bonneau
article: INTERSPEECH 2021, Aug 2021, Brno, Czech Republic. ⟨10.21437/Interspeech.2021-601⟩
Accès au texte intégral et bibtex

titre: Language recognition on unknown conditions: the LORIA-Inria-MULTISPEECH system for AP20-OLR Challenge
auteur: Raphaël Duroselle, Md Sahidullah, Denis Jouvet, Irina Illina
article: INTERSPEECH 2021, Aug 2021, Brno, Czech Republic. ⟨10.21437/Interspeech.2021-276⟩
Accès au texte intégral et bibtex

titre: Towards the prediction of the vocal tract shape from the sequence of phonemes to be articulated
auteur: Vinicius Ribeiro, Karyna Isaieva, Justine Leclère, Pierre-André Vuissoz, Yves Laprie
article: INTERSPEECH 2021, Aug 2021, Brno, Czech Republic. ⟨10.21437/Interspeech.2021-184⟩
Accès au texte intégral et bibtex

titre: Data Quality as Predictor of Voice Anti-Spoofing Generalization
auteur: Bhusan Chettri, Rosa González Hautamäki, Md Sahidullah, Tomi Kinnunen
article: INTERSPEECH 2021, Aug 2021, Brno, Czech Republic. ⟨10.21437/Interspeech.2021-1180⟩
Accès au texte intégral et bibtex

titre: Modeling and training strategies for language recognition systems
auteur: Raphaël Duroselle, Md Sahidullah, Denis Jouvet, Irina Illina
article: INTERSPEECH 2021, Aug 2021, Brno, Czech Republic. ⟨10.21437/Interspeech.2021-277⟩
Accès au texte intégral et bibtex

titre: Explaining deep learning models for speech enhancement
auteur: Sunit Sivasankaran, Emmanuel Vincent, Dominique Fohr
article: INTERSPEECH 2021, Aug 2021, Brno, Czech Republic. ⟨10.21437/Interspeech.2021-1764⟩
Accès au texte intégral et bibtex

titre: Deep scattering network for speech emotion recognition
auteur: Premjeet Singh, Goutam Saha, Md Sahidullah
article: EUSIPCO 2021 – 29th European Signal Processing Conference, Aug 2021, Dublin / Virtual, Ireland. ⟨10.23919/EUSIPCO54536.2021.9615958⟩
Accès au texte intégral et bibtex

titre: Improving transfer of expressivity for end-to-end multispeaker text-to-speech synthesis
auteur: Ajinkya Kulkarni, Vincent Colotte, Denis Jouvet
article: EUSIPCO 2021 – 29th European Signal Processing Conference, European Association for Signal Processing (EURASIP), Aug 2021, Dublin / Virtual, Ireland. ⟨10.23919/EUSIPCO54536.2021.9616249⟩
Accès au texte intégral et bibtex

titre: Attention-based distributed speech enhancement for unconstrained microphone arrays with varying number of nodes
auteur: Nicolas Furnon, Romain Serizel, Slim Essid, Irina Illina
article: EUSIPCO 2021 – 29th European Signal Processing Conference, IEEE, Aug 2021, Dublin / Virtual, Ireland. ⟨10.23919/EUSIPCO54536.2021.9616358⟩
Accès au texte intégral et bibtex

titre: StutterNet: Stuttering Detection Using Time Delay Neural Network
auteur: Shakeel Ahmad Sheikh, Md Sahidullah, Fabrice Hirsch, Slim Ouni
article: EUSIPCO 2021 – 29th European Signal Processing Conference, Aug 2021, Dublin / Virtual, Ireland. ⟨10.23919/EUSIPCO54536.2021.9616063⟩
Accès au texte intégral et bibtex

titre: Cross-Corpora Language Recognition: A Preliminary Investigation with Indian Languages
auteur: Spandan Dey, Goutam Saha, Md Sahidullah
article: EUSIPCO 2021 – 29th European Signal Processing Conference, Aug 2021, Dublin / Virtual, Ireland. ⟨10.23919/EUSIPCO54536.2021.9616273⟩
Accès au texte intégral et bibtex

titre: Compensate multiple distortions for speaker recognition systems
auteur: Mohammad Mohammadamini, Driss Matrouf, Jean-Francois Bonastre, Romain Serizel, Sandipana Dowerah, Denis Jouvet
article: EUSIPCO 2021 – 29th European Signal Processing Conference, Aug 2021, Dublin / Virtual, Ireland. ⟨10.23919/EUSIPCO54536.2021.9615983⟩
Accès au texte intégral et bibtex

titre: Learning-based estimation of individual absorption profiles from a single room impulse response with known positions of source, sensor and surfaces
auteur: Stéphane Dilungana, Antoine Deleforge, Cédric Foy, Sylvain Faisan
article: INTER-NOISE and NOISE-CON Congress and Conference Proceedings, Aug 2021, Internet, United States. pp.5623-5630, ⟨10.3397/IN-2021-3186⟩
Accès au bibtex

titre: Assimilations de voisement et interférences français/allemand
auteur: Anne Bonneau
article: RéaL2 2021 – Colloque International du Réseau d’Acquisition des Langues Secondes, Jul 2021, Toulouse, France
Accès au texte intégral et bibtex

titre: GECko+: a Grammatical and Discourse Error Correction Tool
auteur: Eduardo Calò, Léo Jacqmin, Thibo Rosemplatt, Maxime Amblard, Miguel Couceiro, Ajinkya Kulkarni
article: TALN 2021 – 28e Conférence sur le Traitement Automatique des Langues Naturelles, Jun 2021, Lille / Virtual, France. pp.8-11
Accès au texte intégral et bibtex

titre: A comparative study of different features for efficient automatic hate speech detection
auteur: Nicolas Zampieri, Irina Illina, Dominique Fohr
article: IPrA 2021 – 17th International Pragmatics Conference, Jun 2021, Winterthur, Switzerland
Accès au texte intégral et bibtex

titre: Multiword Expression Features for Automatic Hate Speech Detection
auteur: Nicolas Zampieri, Irina Illina, Dominique Fohr
article: NLDB 2021 – 26th International Conference on Natural Language & Information Systems, Jun 2021, Saarbrücken/Virtual, Germany
Accès au texte intégral et bibtex

titre: Unsupervised Domain Adaptation in Cross-corpora Abusive Language Detection
auteur: Tulika Bose, Irina Illina, Dominique Fohr
article: SocialNLP 2021 – The 9th International Workshop on Natural Language Processing for Social Media, Jun 2021, Virtual, France
Accès au texte intégral et bibtex

titre: Generalisability of Topic Models in Cross-corpora Abusive Language Detection
auteur: Tulika Bose, Irina Illina, Dominique Fohr
article: NLP4IF 2021 – Workshop Censorship, Disinformation, and Propaganda, Jun 2021, Mexico City/Virtual, Mexico
Accès au texte intégral et bibtex

titre: What’s All the FUSS About Free Universal Sound Separation Data?
auteur: Scott Wisdom, Hakan Erdogan, Daniel P W Ellis, Romain Serizel, Nicolas Turpault, Eduardo Fonseca, Justin Salamon, Prem Seetharaman, John R Hershey
article: ICASSP 2021 – 46th International Conference on Acoustics, Speech, and Signal Processing, Jun 2021, Toronto/Virtual, Canada. ⟨10.1109/ICASSP39728.2021.9414774⟩
Accès au texte intégral et bibtex

titre: Switching Variational Auto-Encoders for Noise-Agnostic Audio-visual Speech Enhancement
auteur: Mostafa Sadeghi, Xavier Alameda-Pineda
article: ICASSP 2021 – 46th International Conference on Acoustics, Speech, and Signal Processing, Jun 2021, Toronto / Virtual, Canada. pp.1-5, ⟨10.1109/ICASSP39728.2021.9414097⟩
Accès au texte intégral et bibtex

titre: Sound Event Detection and Separation: a Benchmark on DESED Synthetic Soundscapes
auteur: Nicolas Turpault, Romain Serizel, Scott Wisdom, Hakan Erdogan, John R Hershey, Eduardo Fonseca, Prem Seetharaman, Justin Salamon
article: ICASSP 2021 – 46th International Conference on Acoustics, Speech, and Signal Processing, Jun 2021, Toronto/Virtual, Canada. ⟨10.1109/ICASSP39728.2021.9414789⟩
Accès au texte intégral et bibtex

titre: Distributed speech separation in spatially unconstrained microphone arrays
auteur: Nicolas Furnon, Romain Serizel, Irina Illina, Slim Essid
article: ICASSP 2021 – 46th International Conference on Acoustics, Speech, and Signal Processing, Jun 2021, Toronto / Virtual, Canada. ⟨10.1109/ICASSP39728.2021.9414758⟩
Accès au texte intégral et bibtex

titre: Improving Sound Event Detection Metrics: Insights from DCASE 2020
auteur: Giacomo Ferroni, Nicolas Turpault, Juan Azcarreta, Francesco Tuveri, Romain Serizel, Çagdaş Bilen, Sacha Krstulović
article: ICASSP 2021 – 46th International Conference on Acoustics, Speech, and Signal Processing, Jun 2021, Toronto/Virtual, Canada. ⟨10.1109/ICASSP39728.2021.9414711⟩
Accès au texte intégral et bibtex

titre: Detecting acoustic reflectors using a robot’s ego-noise
auteur: Usama Saqib, Antoine Deleforge, Jesper Rindom Jensen
article: ICASSP 2021 – 46th International Conference on Acoustics, Speech, and Signal Processing, Jun 2021, Toronto / Virtual, Canada. ⟨10.1109/ICASSP39728.2021.9414061⟩
Accès au texte intégral et bibtex

titre: Learnable MFCCs for Speaker Verification
auteur: Xuechen Liu, Md Sahidullah, Tomi Kinnunen
article: ISCAS 2021 – IEEE International Symposium on Circuits and Systems, May 2021, Daegu, South Korea. ⟨10.1109/ISCAS51556.2021.9401593⟩
Accès au texte intégral et bibtex

titre: Non-linear frequency warping using constant-Q transformation for speech emotion recognition
auteur: Premjeet Singh, Goutam Saha, Md Sahidullah
article: ICCCI 2021 – International Conference on Computer Communication and Informatics, Jan 2021, Coimbatore, India. ⟨10.1109/ICCCI50826.2021.9402569⟩
Accès au texte intégral et bibtex

titre: Domain-Dependent Speaker Diarization for the Third DIHARD Challenge
auteur: Kishore A. Kumar, Shefali Waldekar, Goutam Saha, Md Sahidullah
article: DIHARD 2021 – 3rd Speech Diarization Challenge Workshop, Jan 2021, Virtual, France
Accès au texte intégral et bibtex

titre: UIAI System for Short-Duration Speaker Verification Challenge 2020
auteur: Md Sahidullah, Achintya Kumar Sarkar, Ville Vestman, Xuechen Liu, Romain Serizel, Tomi Kinnunen, Zheng-Hua Tan, Emmanuel Vincent
article: SLT 2021 – IEEE Spoken Language Technology Workshop, IEEE, Jan 2021, Shenzhen / Virtual, China. ⟨10.1109/SLT48900.2021.9383596⟩
Accès au texte intégral et bibtex

titre: Foreground-Background Ambient Sound Scene Separation
auteur: Michel Olvera, Emmanuel Vincent, Romain Serizel, Gilles Gasso
article: EUSIPCO 2020 – 28th European Signal Processing Conference, Jan 2021, Amsterdam / Virtual, Netherlands. ⟨10.23919/Eusipco47968.2020.9287436⟩
Accès au texte intégral et bibtex

titre: MRI Vocal Tract Sagittal Slices Estimation during Speech Production of CV
auteur: Ioannis K Douros, Ajinkya Kulkarni, Yu Xie, Chrysanthi Dourou, Jacques Felblinger, Karyna Isaieva, Pierre-André Vuissoz, Yves Laprie
article: EUSIPCO 2020 – 28th European Signal Processing Conference, Jan 2021, Amsterdam / Virtual, Netherlands. ⟨10.23919/Eusipco47968.2020.9287834⟩
Accès au texte intégral et bibtex

titre: Analyzing the impact of speaker localization errors on speech separation for automatic speech recognition
auteur: Sunit Sivasankaran, Emmanuel Vincent, Dominique Fohr
article: EUSIPCO 2020 – 28th European Signal Processing Conference, Jan 2021, Amsterdam / Virtual, Netherlands. ⟨10.23919/Eusipco47968.2020.9287541⟩
Accès au texte intégral et bibtex

Book sections

titre: Histoire des machines parlantes
auteur: Benjamin Elie, Camille Fauth, Melissa Barkat-Defradas
article: Christelle Dodane; Claudia Schweitzer. HISTOIRE DE LA DESCRIPTION DE LA PAROLE : DE L’INTROSPECTION À L’INSTRUMENTATION, Honoré Champion, 2021, 9782745355959
Accès au bibtex

Patents

titre: Audio-driven speech animation using recurrent neural network
auteur: Slim Ouni, Théo Biasutto-Lervat, Sara Dahmani
article: United States, Patent n° : WO2021023861. 2021
Accès au bibtex

Theses

titre: Apprentissage profond pour le rehaussement de la parole dans les antennes acoustiques ad-hoc
auteur: Nicolas Furnon
article: Informatique [cs]. Université de Lorraine, 2021. Français. ⟨NNT : 2021LORR0277⟩
Accès au texte intégral et bibtex

titre: Robustness of language recognition system to transmission channel
auteur: Raphaël Duroselle
article: Computer Science [cs]. Université de Lorraine, 2021. English. ⟨NNT : 2021LORR0250⟩
Accès au texte intégral et bibtex

titre: Implicit and explicit phase modeling in deep learning-based source separation
auteur: Manuel Pariente
article: Machine Learning [stat.ML]. Université de Lorraine, 2021. English. ⟨NNT : 2021LORR0150⟩
Accès au texte intégral et bibtex

titre: Analyse des problématiques liées à la reconnaissance de sons ambiants en environnement réel
auteur: Nicolas Turpault
article: Informatique [cs]. Université de Lorraine, 2021. Français. ⟨NNT : 2021LORR0108⟩
Accès au texte intégral et bibtex

titre: Modélisation de la coarticulation multimodale : vers l’animation d’une tête parlante intelligible
auteur: Théo Biasutto-Lervat
article: Intelligence artificielle [cs.AI]. Université de Lorraine, 2021. Français. ⟨NNT : 2021LORR0019⟩
Accès au texte intégral et bibtex

Preprints, Working Papers, …

titre: SAMbA: Speech enhancement with Asynchronous ad-hoc Microphone Arrays
auteur: Nicolas Furnon, Romain Serizel, Slim Essid, Irina Illina
article: 2021
Accès au texte intégral et bibtex

titre: Multichannel Speech Enhancement for Speaker Verification in Noisy and Reverberant Environments
auteur: Sandipana Dowerah, Romain Serizel, Denis Jouvet, Mohammad Mohammadamini, Driss Matrouf
article: 2021
Accès au bibtex

titre: Analysis of weak labels for sound event tagging
auteur: Nicolas Turpault, Romain Serizel, Emmanuel Vincent
article: 2021
Accès au texte intégral et bibtex

titre: ABSP System for The Third DIHARD Challenge
auteur: Kishore A. Kumar, Shefali Waldekar, Goutam Saha, Md Sahidullah
article: 2021
Accès au texte intégral et bibtex

2020

Journal articles

titre: Classification of Hate Speech Using Deep Neural Networks
auteur: Ashwin Geet d’Sa, Irina Illina, Dominique Fohr
article: Revue d’Information Scientifique & Technique, 2020, From Data and Information Processing to Knowledge Organization : Architectures, Models and Systems, 25 (01)
Accès au texte intégral et bibtex

titre: Peut-on faire confiance aux IA ?
auteur: Emmanuel Vincent
article: The Conversation France, 2020
Accès au bibtex

titre: Duration modelling and evaluation for Arabic statistical parametric speech synthesis
auteur: Imene Zangar, Zied Mnasri, Vincent Colotte, Denis Jouvet
article: Multimedia Tools and Applications, 2020, ⟨10.1007/s11042-020-09901-7⟩
Accès au texte intégral et bibtex

titre: ASVspoof 2019: A large-scale public database of synthesized, converted and replayed speech
auteur: Xin Wang, Junichi Yamagishi, Massimiliano Todisco, Héctor Delgado, Andreas Nautsch, Nicholas Evans, Md Sahidullah, Ville Vestman, Tomi Kinnunen, Kong Aik Lee, Lauri Juvela, Paavo Alku, Yu-Huai Peng, Hsin-Te Hwang, Yu Tsao, Hsin-Min Wang, Sébastien Le Maguer, Markus Becker, Fergus Henderson, Rob Clark, Yu Zhang, Quan Wang, Ye Jia, Kai Onuma, Koji Mushika, Takashi Kaneda, Yuan Jiang, Li-Juan Liu, Yi-Chiao Wu, Wen-Chin Huang, Tomoki Toda, Kou Tanaka, Hirokazu Kameoka, Ingmar Steiner, Driss Matrouf, Jean-François Bonastre, Avashna Govender, Srikanth Ronanki, Jing-Xuan Zhang, Zhen-Hua Ling
article: Computer Speech and Language, 2020, 64, pp.101114. ⟨10.1016/j.csl.2020.101114⟩
Accès au texte intégral et bibtex

titre: Automatic Tongue Delineation from MRI Images with a Convolutional Neural Network Approach
auteur: Karyna Isaieva, Yves Laprie, Nicolas Turpault, Alexis Houssard, Jacques Felblinger, Pierre-André Vuissoz
article: Applied Artificial Intelligence, 2020, 34 (14), pp.1115-1123. ⟨10.1080/08839514.2020.1824090⟩
Accès au bibtex

titre: Optimization of data-driven filterbank for automatic speaker verification
auteur: Susanta Sarangi, Md Sahidullah, Goutam Saha
article: Digital Signal Processing, 2020, 104, ⟨10.1016/j.dsp.2020.102795⟩
Accès au texte intégral et bibtex

titre: Some consideration on expressive audiovisual speech corpus acquisition using a multimodal platform
auteur: Sara Dahmani, Vincent Colotte, Slim Ouni
article: Language Resources and Evaluation, 2020, ⟨10.1007/s10579-020-09500-w⟩
Accès au texte intégral et bibtex

titre: Joint NN-Supported Multichannel Reduction of Acoustic Echo, Reverberation and Noise
auteur: Guillaume Carbajal, Romain Serizel, Emmanuel Vincent, Eric Humbert
article: IEEE/ACM Transactions on Audio, Speech and Language Processing, 2020, ⟨10.1109/TASLP.2020.3008974⟩
Accès au texte intégral et bibtex

titre: Tandem Assessment of Spoofing Countermeasures and Automatic Speaker Verification: Fundamentals
auteur: Tomi Kinnunen, Héctor Delgado, Nicholas Evans, Kong-Aik Lee, Ville Vestman, Andreas Nautsch, Massimiliano Todisco, Xin Wang, Md Sahidullah, Junichi Yamagishi, Douglas A Reynolds
article: IEEE/ACM Transactions on Audio, Speech and Language Processing, 2020, 28, pp.2195-2210. ⟨10.1109/TASLP.2020.3009494⟩
Accès au texte intégral et bibtex

titre: A scalable and effective rough set theory-based approach for big data pre-processing
auteur: Zaineb Chelly Dagdia, Christine Zarges, Gaël Beck, Mustapha Lebbah
article: Knowledge and Information Systems (KAIS), 2020, ⟨10.1007/s10115-020-01467-y⟩
Accès au texte intégral et bibtex

titre: Measurement of Tongue Tip Velocity from Real-Time MRI and Phase-Contrast Cine-MRI in Consonant Production
auteur: Karyna Isaieva, Yves Laprie, Freddy Odille, Ioannis K Douros, Jacques Felblinger, Pierre-André Vuissoz
article: Journal of Imaging, 2020, 6 (5), pp.31. ⟨10.3390/jimaging6050031⟩
Accès au texte intégral et bibtex

titre: On the Use of Artificial Malicious Patterns for Android Malware Detection
auteur: Manel Jerbi, Zaineb Chelly Dagdia, Slim Bechikh, Mohamed Makhlouf, Lamjed Ben Said
article: Computers and Security, 2020, 92, pp.101743. ⟨10.1016/j.cose.2020.101743⟩
Accès au texte intégral et bibtex

titre: Separation of Alpha-Stable Random Vectors
auteur: Mathieu Fontaine, Roland Badeau, Antoine Liutkus
article: Signal Processing, 2020, pp.107465. ⟨10.1016/j.sigpro.2020.107465⟩
Accès au texte intégral et bibtex

titre: RNN Language Model Estimation for Out-of-Vocabulary Words
auteur: Irina Illina, Dominique Fohr
article: Lecture Notes in Artificial Intelligence, 2020, 12598, ⟨10.1007/978-3-030-66527-2_15⟩
Accès au texte intégral et bibtex

Conference papers

titre: DNN-Based Parametric Speech Synthesis Enhanced With Articulatory Information
auteur: Anastasiia Tsukanova, Ioannis K Douros, Yves Laprie
article: ISSP 2020 – 12th International Seminar on Speech Production, Dec 2020, Providence / Virtual, United States
Accès au texte intégral et bibtex

titre: Synthesize MRI vocal tract data during CV production
auteur: Ioannis K Douros, Chrysanthi Dourou, Yu Xie, Jacques Felblinger, Karyna Isaieva, Pierre-André Vuissoz, Yves Laprie
article: ISSP 2020 – 12th International Seminar on Speech Production, Dec 2020, Providence / Virtual, United States
Accès au texte intégral et bibtex

titre: F1 and F2 measurements for French oral vowel with a new pneumotachograph mask
auteur: Amélie Elmerich, Angelique Amelot, Shinji Maeda, Yves Laprie, Jean Francois Papon, Lise Crevier-Buchman
article: ISSP 2020 – 12th International Seminar on Speech Production, Dec 2020, Providence / Virtual, United States
Accès au texte intégral et bibtex

titre: Tracking the tongue contours in rt-MRI films with an autoencoder DNN approach
auteur: Karyna Isaieva, Yves Laprie, Alexis Houssard, Jacques Felblinger, Pierre-André Vuissoz
article: ISSP 2020 – 12th International Seminar on Speech Production, Dec 2020, Providence / Virtual, United States
Accès au texte intégral et bibtex

titre: Vocal tract sagittal slices estimation from MRI midsagittal slices during speech production of CV
auteur: Ioannis K Douros, Yu Xie, Chrysanthi Dourou, Jacques Felblinger, Karyna Isaieva, Pierre-André Vuissoz, Yves Laprie
article: ISSP 2020 – 12th International Seminar on Speech Production, Dec 2020, Providence / Virtual, United States
Accès au texte intégral et bibtex

titre: Mean Absorption Coefficient Estimation From Impulse Responses: Deep Learning vs. Sabine
auteur: Corto Bastien, Antoine Deleforge, Cédric Foy
article: E-FA 2020 – Forum Acusticum 2020, Dec 2020, Lyon / Virtual, France. pp.2, ⟨10.48465/fa.2020.0785⟩
Accès au texte intégral et bibtex

titre: Label Propagation-Based Semi-Supervised Learning for Hate Speech Classification
auteur: Ashwin Geet d’Sa, Irina Illina, Dominique Fohr, Dietrich Klakow, Dana Ruiter
article: Insights from Negative Results Workshop, EMNLP 2020, Nov 2020, Punta Cana, Dominican Republic
Accès au texte intégral et bibtex

titre: A Study of F0 Modification for X-Vector Based Speech Pseudo-Anonymization Across Gender
auteur: Pierre Champion, Denis Jouvet, Anthony Larcher
article: The Second AAAI Workshop on Privacy-Preserving Artificial Intelligence (PPAI), Nov 2020, online, United States
Accès au texte intégral et bibtex

titre: Task-Aware Separation for the DCASE 2020 Task 4 Sound Event Detection and Separation Challenge
auteur: Samuele Cornell, Michel Olvera, Manuel Pariente, Giovanni Pepe, Emanuele Principi, Leonardo Gabrielli, Stefano Squartini
article: DCASE 2020 – 5th Workshop on Detection and Classification of Acoustic Scenes and Events, Nov 2020, Virtual, Japan
Accès au texte intégral et bibtex

titre: Domain-Adversarial Training and Trainable Parallel Front-end for the DCASE 2020 Task 4 Sound Event Detection Challenge
auteur: Samuele Cornell, Michel Olvera, Manuel Pariente, Giovanni Pepe, Emanuele Principi, Leonardo Gabrielli, Stefano Squartini
article: DCASE 2020 – 5th Workshop on Detection and Classification of Acoustic Scenes and Events, Nov 2020, Virtual, Japan
Accès au texte intégral et bibtex

titre: Unsupervised regularization of the embedding extractor for robust language identification
auteur: Raphaël Duroselle, Denis Jouvet, Irina Illina
article: Odyssey 2020 – The Speaker and Language Recognition Workshop, Nov 2020, Tokyo, Japan
Accès au texte intégral et bibtex

titre: HUMAN: Hierarchical Universal Modular ANnotator
auteur: Moritz Wolf, Dana Ruiter, Ashwin Geet d’Sa, Liane Reiners, Jan Alexandersson, Dietrich Klakow
article: EMNLP 2020 System Demonstration, Nov 2020, Punta Cana (Virtual), Dominican Republic
Accès au bibtex

titre: Improving Sound Event Detection In Domestic Environments Using Sound Separation
auteur: Nicolas Turpault, Scott Wisdom, Hakan Erdogan, John R Hershey, Romain Serizel, Eduardo Fonseca, Prem Seetharaman, Justin Salamon
article: DCASE Workshop 2020 – Detection and Classification of Acoustic Scenes and Events, Nov 2020, Tokyo / Virtual, Japan
Accès au texte intégral et bibtex

titre: Training Sound Event Detection On A Heterogeneous Dataset
auteur: Nicolas Turpault, Romain Serizel
article: DCASE Workshop, Nov 2020, Tokyo, Japan
Accès au texte intégral et bibtex

titre: Metric learning loss functions to reduce domain mismatch in the x-vector space for language recognition
auteur: Raphaël Duroselle, Denis Jouvet, Irina Illina
article: INTERSPEECH 2020, Oct 2020, Shanghai / Virtual, China
Accès au texte intégral et bibtex

titre: Transfer learning of the expressivity using flow metric learning in multispeaker text-to-speech synthesis
auteur: Ajinkya Kulkarni, Vincent Colotte, Denis Jouvet
article: INTERSPEECH 2020, Oct 2020, Shanghai / Virtual, China
Accès au texte intégral et bibtex

titre: Correlation between prosody and pragmatics: case study of discourse markers in French and English
auteur: Lou Lee, Denis Jouvet, Katarina Bartkova, Yvon Keromnes, Mathilde Dargnat
article: INTERSPEECH 2020, Oct 2020, Shanghai, China
Accès au texte intégral et bibtex

titre: A Comparative Re-Assessment of Feature Extractors for Deep Speaker Embeddings
auteur: Xuechen Liu, Md Sahidullah, Tomi Kinnunen
article: INTERSPEECH 2020, Oct 2020, Shanghai, China
Accès au texte intégral et bibtex

titre: Design Choices for X-vector Based Speaker Anonymization
auteur: Brij Mohan Lal Srivastava, Natalia Tomashenko, Xin Wang, Emmanuel Vincent, Junichi Yamagishi, Mohamed Maouche, Aurélien Bellet, Marc Tommasi
article: INTERSPEECH 2020, International Speech Communication Association (ISCA), Oct 2020, Shanghai, China
Accès au texte intégral et bibtex

titre: Achieving Multi-Accent ASR via Unsupervised Acoustic Model Adaptation
auteur: Mehmet Ali Tuğtekin Turan, Emmanuel Vincent, Denis Jouvet
article: INTERSPEECH 2020, Oct 2020, Shanghai, China
Accès au texte intégral et bibtex

titre: Introducing the VoicePrivacy initiative
auteur: Natalia Tomashenko, Brij Mohan Lal Srivastava, Xin Wang, Emmanuel Vincent, Andreas Nautsch, Junichi Yamagishi, Nicholas Evans, Jose Patino, Jean-François Bonastre, Paul-Gauthier Noé, Massimiliano Todisco
article: INTERSPEECH 2020, Oct 2020, Shanghai, China
Accès au texte intégral et bibtex

titre: Kaldi-web: An installation-free, on-device speech recognition system
auteur: Mathieu Hu, Laurent Pierron, Emmanuel Vincent, Denis Jouvet
article: INTERSPEECH 2020 Show & Tell, Oct 2020, Shanghai, China
Accès au texte intégral et bibtex

titre: Using Silence MR Image to Synthesise Dynamic MRI Vocal Tract Data of CV
auteur: Ioannis K Douros, Ajinkya Kulkarni, Chrysanthi Dourou, Yu Xie, Jacques Felblinger, Karyna Isaieva, Pierre-André Vuissoz, Yves Laprie
article: INTERSPEECH 2020, Oct 2020, Shanghai / Virtual, China
Accès au texte intégral et bibtex

titre: Asteroid: the PyTorch-based audio source separation toolkit for researchers
auteur: Manuel Pariente, Samuele Cornell, Joris Cosentino, Sunit Sivasankaran, Efthymios Tzinis, Jens Heitkaemper, Michel Olvera, Fabian-Robert Stöter, Mathieu Hu, Juan M. Martín-Doñas, David Ditter, Ariel Frank, Antoine Deleforge, Emmanuel Vincent
article: INTERSPEECH 2020, Oct 2020, Shanghai, China
Accès au texte intégral et bibtex

titre: On semi-supervised LF-MMI training of acoustic models with limited data
auteur: Imran Sheikh, Emmanuel Vincent, Irina Illina
article: INTERSPEECH 2020, Oct 2020, Shanghai, China
Accès au texte intégral et bibtex

titre: Detecting and counting overlapping speakers in distant speech scenarios
auteur: Samuele Cornell, Maurizio Omologo, Stefano Squartini, Emmanuel Vincent
article: INTERSPEECH 2020, Oct 2020, Shanghai, China
Accès au texte intégral et bibtex

titre: A comparative study of speech anonymization metrics
auteur: Mohamed Maouche, Brij Mohan Lal Srivastava, Nathalie Vauquier, Aurélien Bellet, Marc Tommasi, Emmanuel Vincent
article: INTERSPEECH 2020, Oct 2020, Shanghai, China
Accès au texte intégral et bibtex

titre: Drone audition for search and rescue: Datasets and challenges
auteur: Antoine Deleforge
article: QUIET DRONES International Symposium on UAV/UAS Noise, Oct 2020, Paris, France
Accès au texte intégral et bibtex

titre: Deep variational metric learning for transfer of expressivity in multispeaker text to Speech
auteur: Ajinkya Kulkarni, Vincent Colotte, Denis Jouvet
article: SLSP 2020 – 8th International Conference on Statistical Language and Speech Processing, Oct 2020, Cardiff / Virtual, United Kingdom
Accès au texte intégral et bibtex

titre: Introduction of semantic model to help speech recognition
auteur: Stephane Level, Irina Illina, Dominique Fohr
article: TSD 2020 – Twenty-third International Conference on Text, Speech and Dialogue, Sep 2020, Brno, Czech Republic
Accès au texte intégral et bibtex

titre: Embedding Formal Contexts Using Unordered Composition
auteur: Esteban Marquer, Ajinkya Kulkarni, Miguel Couceiro
article: FCA4AI – 8th International Workshop “What can FCA do for Artificial Intelligence?” (co-located with ECAI2020), Aug 2020, Santiago de Compostela, Spain
Accès au texte intégral et bibtex

titre: Projet AMIS : résumé et traduction automatique de vidéos
auteur: Mohamed Amine Menacer, Dominique Fohr, Denis Jouvet, Karima Abidi, David Langlois, Kamel Smaïli
article: 6e conférence conjointe Journées d’Études sur la Parole (JEP, 33e édition), Traitement Automatique des Langues Naturelles (TALN, 27e édition), Rencontre des Étudiants Chercheurs en Informatique pour le Traitement Automatique des Langues (RÉCITAL, 22e édition). Volume 4 : Démonstrations et résumés d’articles internationaux, Jun 2020, Nancy, France. pp.53-56
Accès au texte intégral et bibtex

titre: Adaptation de domaine non supervisée pour la reconnaissance de la langue par régularisation d’un réseau de neurones
auteur: Raphaël Duroselle, Denis Jouvet, Irina Illina
article: 6e conférence conjointe Journées d’Études sur la Parole (JEP, 33e édition), Traitement Automatique des Langues Naturelles (TALN, 27e édition), Rencontre des Étudiants Chercheurs en Informatique pour le Traitement Automatique des Langues (RÉCITAL, 22e édition). Volume 1 : Journées d’Études sur la Parole, Jun 2020, Nancy, France. pp.190-198
Accès au texte intégral et bibtex

titre: Étude comparative des paramètres d’entrée pour la synthèse expressive audiovisuelle de la parole par DNNs
auteur: Sara Dahmani, Vincent Colotte, Slim Ouni
article: 6e conférence conjointe Journées d’Études sur la Parole (JEP, 33e édition), Traitement Automatique des Langues Naturelles (TALN, 27e édition), Rencontre des Étudiants Chercheurs en Informatique pour le Traitement Automatique des Langues (RÉCITAL, 22e édition). Volume 1 : Journées d’Études sur la Parole, Jun 2020, Nancy, France. pp.127-135
Accès au texte intégral et bibtex

titre: Introduction d’informations sémantiques dans un système de reconnaissance de la parole
auteur: Stephane Level, Irina Illina, Dominique Fohr
article: 6e conférence conjointe Journées d’Études sur la Parole (JEP, 33e édition), Traitement Automatique des Langues Naturelles (TALN, 27e édition), Rencontre des Étudiants Chercheurs en Informatique pour le Traitement Automatique des Langues (RÉCITAL, 22e édition). Volume 1 : Journées d’Études sur la Parole, Jun 2020, Nancy, France. pp.362-369
Accès au texte intégral et bibtex

titre: Étude comparative de corrélats prosodiques de marqueurs discursifs français et anglais selon leur fonction pragmatique
auteur: Lou Lee, Denis Jouvet, Katarina Bartkova, Yvon Keromnes, Mathilde Dargnat
article: 6e conférence conjointe Journées d’Études sur la Parole (JEP, 33e édition), Traitement Automatique des Langues Naturelles (TALN, 27e édition), Rencontre des Étudiants Chercheurs en Informatique pour le Traitement Automatique des Langues (RÉCITAL, 22e édition). Volume 1 : Journées d’Études sur la Parole, Jun 2020, Nancy, France. pp.335-343
Accès au texte intégral et bibtex

titre: Towards Non-Toxic Landscapes: Automatic Toxic Comment Detection Using DNN
auteur: Ashwin Geet d’Sa, Irina Illina, Dominique Fohr
article: TRAC-2020, Second Workshop on Trolling, Aggression and Cyberbullying (LREC, 2020), May 2020, Marseille, France
Accès au texte intégral et bibtex

titre: SLOGD: Speaker Location Guided Deflation Approach to Speech Separation
auteur: Sunit Sivasankaran, Emmanuel Vincent, Dominique Fohr
article: ICASSP 2020 – 45th International Conference on Acoustics, Speech, and Signal Processing, May 2020, Barcelona, Spain
Accès au texte intégral et bibtex

titre: DNN-Based Distributed Multichannel Mask Estimation for Speech Enhancement in Microphone Arrays
auteur: Nicolas Furnon, Romain Serizel, Irina Illina, Slim Essid
article: ICASSP 2020 – 45th International Conference on Acoustics, Speech, and Signal Processing, May 2020, Barcelona, Spain
Accès au texte intégral et bibtex

titre: Sound event detection in synthetic domestic environments
auteur: Romain Serizel, Nicolas Turpault, Ankit Shah, Justin Salamon
article: ICASSP 2020 – 45th International Conference on Acoustics, Speech, and Signal Processing, May 2020, Barcelona, Spain
Accès au texte intégral et bibtex

titre: Evaluating Voice Conversion-based Privacy Protection against Informed Attackers
auteur: Brij Mohan Lal Srivastava, Nathalie Vauquier, Md Sahidullah, Aurélien Bellet, Marc Tommasi, Emmanuel Vincent
article: ICASSP 2020 – 45th International Conference on Acoustics, Speech, and Signal Processing, IEEE Signal Processing Society, May 2020, Barcelona, Spain. pp.2802-2806
Accès au texte intégral et bibtex

titre: Limitations of weak labels for embedding and tagging
auteur: Nicolas Turpault, Romain Serizel, Emmanuel Vincent
article: ICASSP 2020 – 45th International Conference on Acoustics, Speech, and Signal Processing, May 2020, Barcelona, Spain
Accès au texte intégral et bibtex

titre: Filterbank design for end-to-end speech separation
auteur: Manuel Pariente, Samuele Cornell, Antoine Deleforge, Emmanuel Vincent
article: ICASSP 2020 – 45th International Conference on Acoustics, Speech, and Signal Processing, May 2020, Barcelona, Spain
Accès au texte intégral et bibtex

titre: BLASTER: An Off-Grid Method for Blind and Regularized Acoustic Echoes Retrieval — with supplementary material
auteur: Diego Di Carlo, Clément Elvira, Antoine Deleforge, Nancy Bertin, Rémi Gribonval
article: ICASSP 2020 – IEEE International Conference on Acoustic Speech and Signal Processing, IEEE, May 2020, Barcelona, Spain
Accès au texte intégral et bibtex

titre: CHiME-6 Challenge: Tackling multispeaker speech recognition for unsegmented recordings
auteur: Shinji Watanabe, Michael Mandel, Jon Barker, Emmanuel Vincent, Ashish Arora, Xuankai Chang, Sanjeev Khudanpur, Vimal Manohar, Daniel Povey, Desh Raj, David Snyder, Aswin Shanmugam Subramanian, Jan Trmal, Bar Ben Yair, Christoph Boeddeker, Zhaoheng Ni, Yusuke Fujita, Shota Horiguchi, Naoyuki Kanda, Takuya Yoshioka, Neville Ryant
article: CHiME 2020 – 6th International Workshop on Speech Processing in Everyday Environments, May 2020, Barcelona / Virtual, Spain
Accès au texte intégral et bibtex

titre: Automatic rule extraction from access rules using Genetic Programming
auteur: Paloma de Las Cuevas, Pablo Garcia-Sanchez, Zaineb Chelly Dagdia, Maria-Isabel Garcia-Arenas, Juan Julian Merelo
article: EvoCOP 2020 – 20th European Conference on Evolutionary Computation in Combinatorial Optimisation, Apr 2020, Seville, Spain
Accès au texte intégral et bibtex

titre: Semantic Context Model for Efficient Speech Recognition
auteur: Stephane Level, Irina Illina, Dominique Fohr
article: ICCAS 2020 – The first International Conference on Cognitive Aircraft Systems, Mar 2020, Toulouse, France
Accès au bibtex

titre: BERT and fastText Embeddings for Automatic Detection of Toxic Speech
auteur: Ashwin Geet d’Sa, Irina Illina, Dominique Fohr
article: SIIE 2020 – Information Systems and Economic Intelligence; International Multi-Conference on: “Organization of Knowledge and Advanced Technologies” (OCTA), Feb 2020, Tunis, Tunisia
Accès au texte intégral et bibtex

titre: A brief introduction to multichannel noise reduction with deep neural networks
auteur: Romain Serizel
article: SpiN 2020 – 12th Speech in Noise Workshop, Jan 2020, Toulouse, France
Accès au texte intégral et bibtex

titre: Reconnaissance automatique de la parole : génération des prononciations non natives pour l’enrichissement du lexique
auteur: Ismael Bada, Dominique Fohr, Irina Illina
article: 6e conférence conjointe Journées d’Études sur la Parole (JEP, 33e édition), Traitement Automatique des Langues Naturelles (TALN, 27e édition), Rencontre des Étudiants Chercheurs en Informatique pour le Traitement Automatique des Langues (RÉCITAL, 22e édition). Volume 1 : Journées d’Études sur la Parole, 2020, Nancy, France. pp.27-35
Accès au texte intégral et bibtex

Book sections

titre: Importance of Dataspace Embeddings when Evaluating Text Clustering Methods
auteur: Alain Lelu, Martine Cadot
article: Data Analysis and Rationality in a Complex World, In press
Accès au texte intégral et bibtex

titre: When Evolutionary Computing Meets Astro- and Geoinformatics
auteur: Zaineb Chelly Dagdia, Miroslav Mirchev
article: Knowledge Discovery in Big Data from Astronomy and Earth Observation, pp.283-306, 2020
Accès au texte intégral et bibtex

Proceedings

titre: Actes de la 6e conférence conjointe Journées d’Études sur la Parole (JEP, 33e édition), Traitement Automatique des Langues Naturelles (TALN, 27e édition), Rencontre des Étudiants Chercheurs en Informatique pour le Traitement Automatique des Langues (RÉCITAL, 22e édition). Volume 2 : Traitement Automatique des Langues Naturelles
auteur: Christophe Benzitoun, Chloé Braud, Laurine Huber, David Langlois, Slim Ouni, Sylvain Pogodalla, Stéphane Schneider
article: 2 : Traitement Automatique des Langues Naturelles, ATALA; AFCP, pp.1-395, 2020
Accès au texte intégral et bibtex

titre: Actes de la 6e conférence conjointe Journées d’Études sur la Parole (JEP, 33e édition), Traitement Automatique des Langues Naturelles (TALN, 27e édition), Rencontre des Étudiants Chercheurs en Informatique pour le Traitement Automatique des Langues (RÉCITAL, 22e édition). Volume 1 : Journées d’Études sur la Parole
auteur: Christophe Benzitoun, Chloé Braud, Laurine Huber, David Langlois, Slim Ouni, Sylvain Pogodalla, Stéphane Schneider
article: 1 : Journées d’Études sur la Parole, ATALA; AFCP, 2020
Accès au texte intégral et bibtex

titre: Actes de la 6e conférence conjointe Journées d’Études sur la Parole (JEP, 33e édition), Traitement Automatique des Langues Naturelles (TALN, 27e édition), Rencontre des Étudiants Chercheurs en Informatique pour le Traitement Automatique des Langues (RÉCITAL, 22e édition). Volume 3 : Rencontre des Étudiants Chercheurs en Informatique pour le TAL
auteur: Christophe Benzitoun, Chloé Braud, Laurine Huber, David Langlois, Slim Ouni, Sylvain Pogodalla, Stéphane Schneider
article: 3 : Rencontre des Étudiants Chercheurs en Informatique pour le TAL, ATALA; AFCP, pp.1-230, 2020
Accès au texte intégral et bibtex

titre: Actes de la 6e conférence conjointe Journées d’Études sur la Parole (JEP, 33e édition), Traitement Automatique des Langues Naturelles (TALN, 27e édition), Rencontre des Étudiants Chercheurs en Informatique pour le Traitement Automatique des Langues (RÉCITAL, 22e édition). Volume 4 : Démonstrations et résumés d’articles internationaux
auteur: Christophe Benzitoun, Chloé Braud, Laurine Huber, David Langlois, Slim Ouni, Sylvain Pogodalla, Stéphane Schneider
article: 4 : Démonstrations et résumés d’articles internationaux, ATALA; AFCP, pp.1-88, 2020
Accès au texte intégral et bibtex

Reports

titre: Speaker information modification in the VoicePrivacy 2020 toolchain
auteur: Pierre Champion, Denis Jouvet, Anthony Larcher
article: [Research Report] INRIA Nancy, équipe Multispeech; LIUM – Laboratoire d’Informatique de l’Université du Mans. 2020
Accès au texte intégral et bibtex

titre: The VoicePrivacy 2020 Challenge Evaluation Plan
auteur: Natalia Tomashenko, Brij Mohan Lal Srivastava, Xin Wang, Emmanuel Vincent, Andreas Nautsch, Junichi Yamagishi, Nicholas Evans, Jose Patino, Jean-François Bonastre, Paul-Gauthier Noé, Massimiliano Todisco
article: LIA – Laboratoire Informatique d’Avignon; MULTISPEECH – Speech Modeling for Facilitating Oral-Based Communication Inria Nancy – Grand Est, LORIA – NLPKD – Department of Natural Language Processing & Knowledge Discovery; Eurecom [Sophia Antipolis]; University of Edinburgh. 2020
Accès au texte intégral et bibtex

Software

titre: voiceHome-2 corpus – automatic speech recognition baseline – scripts
auteur: Sunit Sivasankaran, Irina Illina, Emmanuel Vincent
article: 2020, ⟨swh:1:dir:e61ed9084af0d3e8542cd4ab3a990d24314a6724;origin=https://hal.archives-ouvertes.fr/hal-02963802;visit=swh:1:snp:b958e3aa64f6b1663929789c8cf28d019f55f57d;anchor=swh:1:rev:6b9bf3964385d0c16d262796d9e4a3a30a52dafd;path=/⟩
Accès au texte intégral et bibtex

Theses

titre: Echo-aware signal processing for audio scene analysis
auteur: Diego Di Carlo
article: Signal and Image processing. Université de Rennes 1; INRIA – IRISA – PANAMA, 2020. English.
Accès au texte intégral et bibtex

titre: Reconnaissance et traduction automatique de la parole de vidéos arabes et dialectales
auteur: Mohamed Amine Menacer
article: Informatique et langage [cs.CL]. Université de Lorraine, 2020. Français. ⟨NNT : 2020LORR0157⟩
Accès au texte intégral et bibtex

titre: Synthèse audiovisuelle de la parole expressive : modélisation des émotions par apprentissage profond
auteur: Sara Dahmani
article: Informatique [cs]. Université de Lorraine, 2020. Français. ⟨NNT : 2020LORR0137⟩
Accès au texte intégral et bibtex

titre: Localization guided speech separation
auteur: Sunit Sivasankaran
article: Machine Learning [cs.LG]. Université de Lorraine, 2020. English. ⟨NNT : 2020LORR0078⟩
Accès au texte intégral et bibtex

titre: Towards a 3 dimensional dynamic generic speaker model to study geometry simplifications of the vocal tract using magnetic resonance imaging data
auteur: Ioannis K Douros
article: Computation and Language [cs.CL]. Université de Lorraine, 2020. English. ⟨NNT : 2020LORR0115⟩
Accès au texte intégral et bibtex

titre: Apprentissage profond bout-en-bout pour le rehaussement de la parole
auteur: Guillaume Carbajal
article: Informatique [cs]. Université de Lorraine, 2020. Français. ⟨NNT : 2020LORR0017⟩
Accès au texte intégral et bibtex

titre: Synthèse paramétrique de la parole Arabe
auteur: Amal Houidhek
article: Traitement du signal et de l’image [eess.SP]. Université de Lorraine; Université de Tunis El Manar (Tunisie), 2020. Français. ⟨NNT : 2020LORR0116⟩
Accès au texte intégral et bibtex

Preprints, Working Papers, …

titre: Emotion recognition from phoneme-duration information
auteur: Ajinkya Kulkarni, Ioannis K Douros, Vincent Colotte, Denis Jouvet
article: 2020
Accès au texte intégral et bibtex

titre: LibriMix: An open-source dataset for generalizable speech separation
auteur: Joris Cosentino, Manuel Pariente, Samuele Cornell, Antoine Deleforge, Emmanuel Vincent
article: 2020
Accès au texte intégral et bibtex

2019

Journal articles

titre: Motion planning for robot audition
auteur: van Quan Nguyen, Francis Colas, Emmanuel Vincent, François Charpillet
article: Autonomous Robots, 2019, 43 (8), pp.2293-2317. ⟨10.1007/s10514-019-09880-1⟩
Accès au texte intégral et bibtex

titre: Audio-Based Search and Rescue with a Drone: Highlights from the IEEE Signal Processing Cup 2019 Student Competition
auteur: Antoine Deleforge, Diego Di Carlo, Martin Strauss, Romain Serizel, Lucio Marcenaro
article: IEEE Signal Processing Magazine, 2019, 36 (5), pp.138-144. ⟨10.1109/MSP.2019.2924687⟩
Accès au texte intégral et bibtex

titre: Summarizing videos into a target language: Methodology, architectures and evaluation
auteur: Kamel Smaïli, Dominique Fohr, Carlos-Emiliano González-Gallardo, Michał L Grega, Lucjan Janowski, Denis Jouvet, Arian Koźbiał, David Langlois, Mikołaj Leszczuk, Odile Mella, Mohamed-Amine Menacer, Amaia Mendez, Elvys Linhares L Pontes, Eric Sanjuan, Juan-Manuel Torres-Moreno, Begona Garcia-Zapirain
article: Journal of Intelligent and Fuzzy Systems, 2019, 1, pp.1-12. ⟨10.3233/JIFS-179350⟩
Accès au texte intégral et bibtex

titre: Sound event detection in the DCASE 2017 Challenge
auteur: Annamaria Mesaros, Aleksandr Diment, Benjamin Elizalde, Toni Heittola, Emmanuel Vincent, Bhiksha Raj, Tuomas Virtanen
article: IEEE/ACM Transactions on Audio, Speech and Language Processing, 2019, 27 (6), pp.992-1006. ⟨10.1109/TASLP.2019.2907016⟩
Accès au texte intégral et bibtex

titre: Voice Mimicry Attacks Assisted by Automatic Speaker Verification
auteur: Ville Vestman, Tomi Kinnunen, Rosa González Hautamäki, Md Sahidullah
article: Computer Speech and Language, 2019, 59, pp.36-54. ⟨10.1016/j.csl.2019.05.005⟩
Accès au texte intégral et bibtex

titre: CRNN-based multiple DoA estimation using acoustic intensity features for Ambisonics recordings
auteur: Lauréline Perotin, Romain Serizel, Emmanuel Vincent, Alexandre Guérin
article: IEEE Journal of Selected Topics in Signal Processing, 2019, Special Issue on Acoustic Source Localization and Tracking in Dynamic Real-life Scenes, 13 (1), pp.22-33. ⟨10.1109/jstsp.2019.2900164⟩
Accès au texte intégral et bibtex

titre: VoiceHome-2, an extended corpus for multichannel speech processing in real homes
auteur: Nancy Bertin, Ewen Camberlein, Romain Lebarbenchon, Emmanuel Vincent, Sunit Sivasankaran, Irina Illina, Frédéric Bimbot
article: Speech Communication, 2019, 106, pp.68-78. ⟨10.1016/j.specom.2018.11.002⟩
Accès au texte intégral et bibtex

titre: Quality Measures for Speaker Verification with Short Utterances
auteur: Arnab Poddar, Md Sahidullah, Goutam Saha
article: Digital Signal Processing, 2019, 88, pp.66-79. ⟨10.1016/j.dsp.2019.01.023⟩
Accès au texte intégral et bibtex

titre: Learning of Hierarchical Temporal Structures for Guided Improvisation
auteur: Ken Déguernel, Emmanuel Vincent, Jérôme Nika, Gerard Assayag, Kamel Smaïli
article: Computer Music Journal, 2019, 43 (2), ⟨10.1162/comj_a_00521⟩
Accès au texte intégral et bibtex

Conference papers

titre: Lead2Gold: Towards exploiting the full potential of noisy transcriptions for speech recognition
auteur: Adrien Dufraux, Emmanuel Vincent, Awni Hannun, Armelle Brun, Matthijs Douze
article: ASRU 2019 – IEEE Automatic Speech Recognition and Understanding Workshop, Dec 2019, Singapore, Singapore
Accès au texte intégral et bibtex

titre: Grands défis scientifiques et technologiques en traitement de la parole: quelles initiatives chez Inria et au niveau européen?
auteur: Emmanuel Vincent
article: Voice Tech Paris 2019, Nov 2019, Paris, France
Accès au bibtex

titre: Regression versus classification for neural network based audio source localization
auteur: Lauréline Perotin, Alexandre Défossez, Emmanuel Vincent, Romain Serizel, Alexandre Guérin
article: WASPAA 2019 – IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, IEEE, Oct 2019, New Paltz, United States
Accès au texte intégral et bibtex

titre: Extractive Text-Based Summarization of Arabic videos: Issues, Approaches and Evaluations
auteur: M A Menacer, C E González-Gallardo, K Abidi, Dominique Fohr, Denis Jouvet, D Langlois, Odile Mella, F Sadat, J M Torres-Moreno, Kamel Smaïli
article: ICALP: International Conference on Arabic Language Processing, Oct 2019, Nancy, France. pp.65-78, ⟨10.1007/978-3-030-32959-4_5⟩
Accès au texte intégral et bibtex

titre: A Fine-grained Multilingual Analysis Based on the Appraisal Theory: Application to Arabic and English Videos
auteur: Karima Abidi, Dominique Fohr, Denis Jouvet, David Langlois, Odile Mella, Kamel Smaïli
article: ICALP: International Conference on Arabic Language Processing, Oct 2019, Nancy, France. pp.49-61, ⟨10.1007/978-3-030-32959-4_4⟩
Accès au texte intégral et bibtex

titre: COMPRISE
auteur: Emmanuel Vincent
article: META-FORUM 2019 – Cost-effective, Multilingual, Privacy-driven voice-enabled Services, Oct 2019, Brussels, Belgium
Accès au bibtex

titre: Sound event detection in domestic environments with weakly labeled data and soundscape synthesis
auteur: Nicolas Turpault, Romain Serizel, Ankit Parag Shah, Justin Salamon
article: Workshop on Detection and Classification of Acoustic Scenes and Events, Oct 2019, New York City, United States
Accès au texte intégral et bibtex

titre: MODALISA une plateforme intégrative pour capturer l’orchestration des gestes et de la parole
auteur: Christelle Dodane, Dominique Boutet, Fabrice Hirsch, Slim Ouni, Aliyah Morgenstern
article: Défi Instrumentation aux Limites, Colloque de restitution, CNRS, Sep 2019, Paris, France
Accès au bibtex

titre: A Multimodal Real-Time MRI Articulatory Corpus of French for Speech Research
auteur: Ioannis K Douros, Jacques Felblinger, Jens Frahm, Karyna Isaieva, Arun Joseph, Yves Laprie, Freddy Odille, Anastasiia Tsukanova, Dirk Voit, Pierre-André Vuissoz
article: INTERSPEECH 2019 – 20th Annual Conference of the International Speech Communication Association, Sep 2019, Graz, Austria
Accès au texte intégral et bibtex

titre: Privacy-Preserving Adversarial Representation Learning in ASR: Reality or Illusion?
auteur: Brij Mohan Lal Srivastava, Aurélien Bellet, Marc Tommasi, Emmanuel Vincent
article: INTERSPEECH 2019 – 20th Annual Conference of the International Speech Communication Association, Sep 2019, Graz, Austria
Accès au texte intégral et bibtex

titre: Towards a method of dynamic vocal tract shapes generation by combining static 3D and dynamic 2D MRI speech data
auteur: Ioannis K Douros, Anastasiia Tsukanova, Karyna Isaieva, Pierre-André Vuissoz, Yves Laprie
article: INTERSPEECH 2019 – 20th Annual Conference of the International Speech Communication Association, Sep 2019, Graz, Austria
Accès au texte intégral et bibtex

titre: Conditional Variational Auto-Encoder for Text-Driven Expressive AudioVisual Speech Synthesis
auteur: Sara Dahmani, Vincent Colotte, Valérian Girard, Slim Ouni
article: INTERSPEECH 2019 – 20th Annual Conference of the International Speech Communication Association, Sep 2019, Graz, Austria
Accès au texte intégral et bibtex

titre: A Statistically Principled and Computationally Efficient Approach to Speech Enhancement using Variational Autoencoders
auteur: Manuel Pariente, Antoine Deleforge, Emmanuel Vincent
article: INTERSPEECH 2019 – 20th Annual Conference of the International Speech Communication Association, Sep 2019, Graz, Austria
Accès au texte intégral et bibtex

titre: Modeling Labial Coarticulation with Bidirectional Gated Recurrent Networks and Transfer Learning
auteur: Théo Biasutto–Lervat, Sara Dahmani, Slim Ouni
article: INTERSPEECH 2019 – 20th Annual Conference of the International Speech Communication Association, Sep 2019, Graz, Austria
Accès au texte intégral et bibtex

titre: ASVspoof 2019: Future Horizons in Spoofed and Fake Audio Detection
auteur: Massimiliano Todisco, Xin Wang, Ville Vestman, Md Sahidullah, Héctor Delgado, Andreas Nautsch, Junichi Yamagishi, Nicholas Evans, Tomi Kinnunen, Kong Aik Lee
article: INTERSPEECH 2019 – 20th Annual Conference of the International Speech Communication Association, Sep 2019, Graz, Austria
Accès au texte intégral et bibtex

titre: I4U Submission to NIST SRE 2018: Leveraging from a Decade of Shared Experiences
auteur: Kong Aik Lee, Ville Hautamäki, Tomi Kinnunen, Hitoshi Yamamoto, Koji Okabe, Ville Vestman, Jing Huang, Guohong Ding, Hanwu Sun, Anthony Larcher, Rohan Kumar Das, Haizhou Li, Mickaël Rouvier, Pierre-Michel Bousquet, Wei Rao, Qing Wang, Chunlei Zhang, Fahimeh Bahmaninezhad, Héctor Delgado, Jose Patino, Qiongqiong Wang, Ling Guo, Takafumi Koshinaka, Jiacen Zhang, Koichi Shinoda, Trung Ngo Trong, Md Sahidullah, Fan Lu, Yun Tang, Ming Tu, Kah Kuan Teh, Huy Dat Tran, Kuruvachan K George, Ivan Kukanov, Florent Desnous, Jichen Yang, Emre Yilmaz, Longting Xu, Jean-François Bonastre, Chenglin Xu, Zhi Hao Lim, Siong Chng, Shivesh Ranjan, John H. L. Hansen, Massimiliano Todisco, Nicholas Evans
article: INTERSPEECH 2019 – 20th Annual Conference of the International Speech Communication Association, Sep 2019, Graz, Austria
Accès au texte intégral et bibtex

titre: An integrative platform to capture the orchestration of gesture and speech
auteur: Christelle Dodane, Dominique Boutet, Ivana Didirkova, Fabrice Hirsch, Slim Ouni, Aliyah Morgenstern
article: GeSpIn 2019 – Gesture and Speech in Interaction, Sep 2019, Paderborn, Germany
Accès au texte intégral et bibtex

titre: Speech Processing and Prosody
auteur: Denis Jouvet
article: TSD 2019 – 22nd International Conference of Text, Speech and Dialogue, Sep 2019, Ljubljana, Slovenia
Accès au texte intégral et bibtex

titre: Glottal Opening Measurements in VCV and VCCV Sequences
auteur: Benjamin Elie, Angelique Amelot, Yves Laprie, Shinji Maeda
article: ICA 2019 – 23rd International Congress on Acoustics, Sep 2019, Aachen, Germany
Accès au texte intégral et bibtex

titre: Acoustic Evaluation of Simplifying Hypotheses Used in Articulatory Synthesis
auteur: Ioannis K Douros, Yves Laprie, Pierre-André Vuissoz, Benjamin Elie
article: ICA 2019 – 23rd International Congress on Acoustics, Sep 2019, Aachen, Germany
Accès au texte intégral et bibtex

titre: Cauchy Multichannel Speech Enhancement with a Deep Speech Prior
auteur: Mathieu Fontaine, Aditya Arie Nugraha, Roland Badeau, Kazuyoshi Yoshii, Antoine Liutkus
article: EUSIPCO 2019 – 27th European Signal Processing Conference, Sep 2019, A Coruña, Spain. ⟨10.23919/EUSIPCO.2019.8903091⟩
Accès au texte intégral et bibtex

titre: Evaluation of text clustering methods and their dataspace embeddings: an exploration
auteur: Alain Lelu, Martine Cadot
article: IFCS 2019 – 16th Conference of the International Federation of Classification Societies, Aug 2019, Thessaloniki, Greece
Accès au texte intégral et bibtex

titre: Comparison between 2D and 3D models for speech production: a study of French vowels
auteur: Ioannis K Douros, Pierre-André Vuissoz, Yves Laprie
article: ICPhS 2019 – International Congress of Phonetic Sciences, Aug 2019, Melbourne, Australia
Accès au texte intégral et bibtex

titre: Effect of head posture on phonation of French vowels
auteur: Ioannis K Douros, Pierre-André Vuissoz, Yves Laprie
article: ICPhS 2019 – International Congress of Phonetic Sciences, Aug 2019, Melbourne, Australia
Accès au texte intégral et bibtex

titre: Can prosody meet pragmatics? Case of discourse particles in French
auteur: Lou Lee, Katarina Bartkova, Denis Jouvet, Mathilde Dargnat, Yvon Keromnes
article: ICPhS 2019 – International Congress of Phonetic Sciences, Aug 2019, Melbourne, Australia
Accès au texte intégral et bibtex

titre: Can static vocal tract positions represent articulatory targets in continuous speech? Matching static MRI captures against real-time MRI for the French language
auteur: Anastasiia Tsukanova, Ioannis K Douros, Anastasia Shimorina, Yves Laprie
article: ICPhS 2019 – International Congress of Phonetic Sciences, Aug 2019, Melbourne, Australia
Accès au texte intégral et bibtex

titre: German obstruent sequences by French L2 learners
auteur: Anne Bonneau
article: ICPhS 2019 – International Congress of Phonetic Sciences, Aug 2019, Melbourne, Australia
Accès au texte intégral et bibtex

titre: Acoustic impacts of geometric approximation at the level of velum and epiglottis on French vowels
auteur: Ioannis K Douros, Pierre-André Vuissoz, Yves Laprie
article: ICPhS 2019 – International Congress of Phonetic Sciences, Aug 2019, Melbourne, Australia
Accès au texte intégral et bibtex

titre: Robust non-linear regression approach for generalized inverse problems in a high dimensional setting
auteur: Florence Forbes, Antoine Deleforge, Radu Horaud, Emeline Perthame
article: AIP 2019 – Applied Inverse Problem conference, Jul 2019, Grenoble, France
Accès au bibtex

titre: Sound Event Detection from Partially Annotated Data: Trends and Challenges
auteur: Romain Serizel, Nicolas Turpault
article: IcETRAN conference, Jun 2019, Srebrno Jezero, Serbia
Accès au texte intégral et bibtex

titre: Machine Translation on a parallel Code-Switched Corpus
auteur: Mohamed Menacer, David Langlois, Denis Jouvet, Dominique Fohr, Odile Mella, Kamel Smaïli
article: Canadian AI 2019 – 32nd Conference on Canadian Artificial Intelligence, May 2019, Ontario, Canada
Accès au texte intégral et bibtex

titre: Layer adaptation for transfer of expressivity in speech synthesis
auteur: Ajinkya Kulkarni, Vincent Colotte, Denis Jouvet
article: LTC’19 – 9th Language & Technology Conference, May 2019, Poznan, Poland
Accès au texte intégral et bibtex

titre: L’impact du trouble du spectre de l’autisme sur le bien-être psychologique des parents
auteur: Tamara Léonova, Géraldine Coffe, Anaïs Tarasconi, Agnès Piquard-Kipffer, Delphine Sardin, Aline Gosse, Julie Boré
article: XVIIIème Congrès de l’Association Internationale de Formation et de Recherche en Éducation Familiale, May 2019, Schoelcher, Martinique, France
Accès au bibtex

titre: Semi-supervised triplet loss based learning of ambient audio embeddings
auteur: Nicolas Turpault, Romain Serizel, Emmanuel Vincent
article: ICASSP 2019, May 2019, Brighton, United Kingdom
Accès au texte intégral et bibtex

titre: Mirage: 2D Source Localization Using Microphone Pair Augmentation with Echoes
auteur: Diego Di Carlo, Antoine Deleforge, Nancy Bertin
article: ICASSP 2019 – IEEE International Conference on Acoustics, Speech and Signal Processing, May 2019, Brighton, United Kingdom. pp.775-779, ⟨10.1109/ICASSP.2019.8683534⟩
Accès au texte intégral et bibtex

titre: An improved uncertainty propagation method for robust i-vector based speaker recognition
auteur: Dayana Ribas, Emmanuel Vincent
article: ICASSP 2019 – 44th International Conference on Acoustics, Speech, and Signal Processing, May 2019, Brighton, United Kingdom
Accès au texte intégral et bibtex

titre: Can We Use Speaker Recognition Technology to Attack Itself? Enhancing Mimicry Attacks Using Automatic Target Speaker Selection
auteur: Tomi Kinnunen, Rosa González Hautamäki, Ville Vestman, Md Sahidullah
article: ICASSP 2019 – 44th International Conference on Acoustics, Speech, and Signal Processing, May 2019, Brighton, United Kingdom
Accès au texte intégral et bibtex

titre: F0 modeling using DNN for Arabic parametric speech synthesis
auteur: Imene Zangar, Zied Mnasri, Vincent Colotte, Denis Jouvet
article: INNSBDDL 2019 – INNS Big Data and Deep Learning, Apr 2019, Sestri Levante, Italy
Accès au texte intégral et bibtex

titre: Parole & deep learning : succès et grands défis
auteur: Emmanuel Vincent
article: Journée IA, Langage et Citoyens, Mar 2019, Nancy, France
Accès au bibtex

Book sections

titre: Introduction to Voice Presentation Attack Detection and Recent Advances
auteur: Md Sahidullah, Héctor Delgado, Massimiliano Todisco, Tomi Kinnunen, Nicholas Evans, Junichi Yamagishi, Kong Aik Lee
article: Sébastien Marcel; Mark S. Nixon; Julian Fierrez; Nicholas Evans. Handbook of Biometric Anti-Spoofing: Presentation Attack Detection, Springer, pp.321-361, 2019, Advances in Computer Vision and Pattern Recognition, 978-3-319-92626-1. ⟨10.1007/978-3-319-92627-8_15⟩
Accès au texte intégral et bibtex

titre: Bibliometric delineation of scientific fields
auteur: Michel Zitt, Alain Lelu, Martine Cadot, Guillaume Cabanac
article: Wolfgang Glänzel; Henk F. Moed; Ulrich Schmoch; Mike Thelwall. Handbook of Science and Technology Indicators, Springer International Publishing, pp.25-68, 2019, Handbook of Science and Technology Indicators, 978-3-030-02510-6. ⟨10.1007/978-3-030-02511-3_2⟩
Accès au texte intégral et bibtex

Reports

titre: Joint NN-Supported Multichannel Reduction of Acoustic Echo, Reverberation and Noise: Supporting Document
auteur: Guillaume Carbajal, Romain Serizel, Emmanuel Vincent, Eric Humbert
article: [Research Report] RR-9303, INRIA Nancy; Invoxia SAS. 2019
Accès au texte intégral et bibtex

titre: A Statistically Principled and Computationally Efficient Approach to Speech Enhancement using Variational Autoencoders : Supporting Document
auteur: Manuel Pariente, Antoine Deleforge, Emmanuel Vincent
article: [Research Report] RR-9268, INRIA. 2019, pp.1-8
Accès au texte intégral et bibtex

titre: I4U Submission to NIST SRE 2018: Leveraging from a Decade of Shared Experiences
auteur: Kong Aik Lee, Ville Hautamäki, Tomi Kinnunen, Hitoshi Yamamoto, Koji Okabe, Ville Vestman, Jing Huang, Guohong Ding, Hanwu Sun, Anthony Larcher, Rohan Kumar Das, Haizhou Li, Mickaël Rouvier, Pierre-Michel Bousquet, Wei Rao, Qing Wang, Chunlei Zhang, Fahimeh Bahmaninezhad, Héctor Delgado, Jose Patino, Qiongqiong Wang, Ling Guo, Takafumi Koshinaka, Jiacen Zhang, Koichi Shinoda, Trung Ngo Trong, Md Sahidullah, Fan Lu, Yun Tang, Ming Tu, Kah Kuan Teh, Huy Dat Tran, Kuruvachan K George, Ivan Kukanov, Florent Desnous, Jichen Yang, Emre Yilmaz, Longting Xu, Jean-François Bonastre, Chenglin Xu, Zhi Hao Lim, Siong Chng, Shivesh Ranjan, John H. L. Hansen, Massimiliano Todisco, Nicholas Evans
article: [Research Report] I4U Consortium. 2019
Accès au texte intégral et bibtex

titre: AI in the media and creative industries
auteur: Baptiste Caramiaux, Fabien Lotte, Joost Geurts, Giuseppe Amato, Malte Behrmann, Frédéric Bimbot, Fabrizio Falchi, Ander Garcia, Jaume Gibert, Guillaume Gravier, Hadmut Holken, Hartmut Koenitz, Sylvain Lefebvre, Antoine Liutkus, Andrew Perkis, Rafael Redondo, Enrico Turrin, Thierry Viéville, Emmanuel Vincent
article: [Research Report] New European Media (NEM). 2019, pp.1-35
Accès au texte intégral et bibtex

Software

titre: Underdetermined Reverberant Source Separation
auteur: Matthieu Kowalski, Emmanuel Vincent, Rémi Gribonval
article: 2019, ⟨swh:1:dir:ec4ae097465d9ea51589537ea94b2ea50e8d134d;origin=https://hal.archives-ouvertes.fr/hal-02309043;visit=swh:1:snp:e35494fd4cb57af0b22131ab8c4a4d8bd5cffcc6;anchor=swh:1:rev:2d23c3e68b755b720ecca8ddd5e1f8fe99909be2;path=/⟩
Accès au texte intégral et bibtex

Theses

titre: Articulatory speech synthesis
auteur: Anastasiia Tsukanova
article: Computation and Language [cs.CL]. Université de Lorraine, 2019. English. ⟨NNT : 2019LORR0166⟩
Accès au texte intégral et bibtex

titre: Localisation et rehaussement de sources de parole au format Ambisonique
auteur: Lauréline Perotin
article: Signal and Image Processing [eess.SP]. Université de Lorraine, 2019. French. ⟨NNT : 2019LORR0124⟩
Accès au texte intégral et bibtex

titre: Processus alpha-stables pour le traitement du signal
auteur: Mathieu Fontaine
article: Signal and Image Processing [eess.SP]. Université de Lorraine, 2019. French. ⟨NNT : 2019LORR0037⟩
Accès au texte intégral et bibtex

Preprints, Working Papers, …

titre: The Speed Submission to DIHARD II: Contributions & Lessons Learned
auteur: Md Sahidullah, Jose Patino, Samuele Cornell, Ruiqing Yin, Sunit Sivasankaran, Hervé Bredin, Pavel Korshunov, Alessio Brutti, Romain Serizel, Emmanuel Vincent, Nicholas Evans, Sébastien Marcel, Stefano Squartini, Claude Barras
article: 2019
Accès au texte intégral et bibtex