End-to-End Spoken Language Understanding and Privacy Preserving Speech Processing

Speaker: Natalia Tomashenko

Date and place: January 7, 2021 at 10:30, by video conference

Abstract:

This talk covers two topics: (1) end-to-end (e2e) spoken language understanding (SLU) from speech and (2) privacy-preserving speech processing. It also discusses the challenges in these research areas and promising directions for future research.

(1) E2e SLU from speech addresses the scenario in which semantic information is extracted directly from the speech signal by a single end-to-end neural network model. Learning semantic information from speech is often challenging because semantically annotated speech corpora are either unavailable or too small. The performance of e2e SLU models can be substantially improved by several methods, including knowledge transfer approaches, speaker adaptation, and the integration of dialog history information in the form of history vectors.

(2) Privacy-preserving speech processing has become an active research area in recent years due to the growing demand for privacy preservation. The VoicePrivacy initiative aims to promote the development of privacy preservation tools for speech technology by gathering a new community to define the tasks of interest and the evaluation methodology, and by benchmarking solutions through a series of competitive challenges. The task of the First VoicePrivacy 2020 Challenge was to develop anonymization solutions that suppress personally identifiable information contained in speech signals while preserving linguistic content and speech naturalness. The talk gives an overview of the First VoicePrivacy 2020 Challenge and presents some of its results.