Multipass system for transcribing audio data, in particular radio and TV shows. The audio stream is first split into homogeneous segments of a manageable size, and each segment is then decoded with the most suitable acoustic model, using a large-vocabulary continuous speech recognition engine (Julius or Sphinx).
Software for comparing the results of several automatic labeling processes through user-defined criteria.
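The two passes described above (segmentation, then per-segment model choice) can be sketched as follows. This is only an illustration under stated assumptions: the frame labels, the `MODELS` table, and the function names are hypothetical and are not taken from the actual system, which relies on Julius or Sphinx for decoding.

```python
from itertools import groupby

# Hypothetical mapping from an acoustic condition to an acoustic model name.
MODELS = {"studio": "wideband_model", "telephone": "narrowband_model"}

def split_homogeneous(frame_labels, max_len):
    """Group consecutive frames sharing a label; cap each segment at max_len frames."""
    segments = []
    start = 0
    for label, run in groupby(frame_labels):
        end = start + len(list(run))
        # Cut long homogeneous runs into pieces of a manageable size.
        for s in range(start, end, max_len):
            segments.append((s, min(s + max_len, end), label))
        start = end
    return segments

def decoding_plan(segments, models):
    """Pair each segment with the model assumed to suit its condition."""
    return [(s, e, models[label]) for s, e, label in segments]

# Assumed per-frame labels, e.g. produced by an earlier classification pass.
labels = ["studio"] * 5 + ["telephone"] * 3
segments = split_homogeneous(labels, max_len=4)
plan = decoding_plan(segments, MODELS)
```

Each entry of `plan` gives a frame range and the model a recognizer would be run with on that range.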
COMPRISE Weakly Supervised STT
COMPRISE Weakly Supervised STT makes it possible to train Speech-to-Text models while reducing the need for time-consuming and expensive manual data transcription. It is part of the software tools developed by COMPRISE.
Speech synthesis and transformation
COMPRISE Voice Transformer
The COMPRISE Voice Transformer increases voice privacy by taking speech audio as input and converting the speaker’s voice to another person’s voice. It is part of the software tools developed by COMPRISE.
Software for Text-To-Speech synthesis (TTS) which relies on a non-uniform unit selection algorithm. It performs all steps from text to speech signal output. In addition, a set of associated tools is available for building a corpus for a TTS system (transcription, alignment…).
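Unit selection is commonly framed as a search for the sequence of corpus units that minimizes a sum of target costs and join costs. The sketch below shows that generic idea with dynamic programming over fixed-size units; it does not reproduce the non-uniform algorithm of this software, and the cost functions and toy units are assumptions chosen for illustration.

```python
def select_units(targets, candidates, target_cost, join_cost):
    """Pick one candidate per position, minimizing target + join costs (Viterbi)."""
    # best[i][j] = (cost of the best path ending at candidates[i][j], backpointer)
    best = [[(target_cost(targets[0], c), None) for c in candidates[0]]]
    for i in range(1, len(targets)):
        row = []
        for c in candidates[i]:
            prev_cost, prev_idx = min(
                (best[i - 1][k][0] + join_cost(p, c), k)
                for k, p in enumerate(candidates[i - 1])
            )
            row.append((prev_cost + target_cost(targets[i], c), prev_idx))
        best.append(row)
    # Backtrack from the cheapest final unit.
    j = min(range(len(best[-1])), key=lambda k: best[-1][k][0])
    total = best[-1][j][0]
    path = []
    for i in range(len(targets) - 1, -1, -1):
        path.append(candidates[i][j])
        j = best[i][j][1]
    path.reverse()
    return path, total

# Toy data: targets are desired pitches, units are (name, pitch) pairs.
targets = [100, 120]
candidates = [[("a1", 100), ("a2", 140)], [("b1", 150), ("b2", 118)]]
path, total = select_units(
    targets,
    candidates,
    target_cost=lambda t, u: abs(t - u[1]),   # mismatch with the desired pitch
    join_cost=lambda p, c: abs(p[1] - c[1]),  # pitch discontinuity at the join
)
```

Here the search keeps `a1` and `b2`, because their combined target and join costs are lower than those of any other pairing.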
Source separation and speech enhancement
Asteroid is a PyTorch-based audio source separation toolkit that enables fast experimentation on common datasets. It comes with source code that supports a large range of datasets and architectures, and a set of recipes to reproduce some important papers.
- Website: https://asteroid-team.github.io/
- GitHub: https://github.com/asteroid-team/asteroid
- Paper: Asteroid: the PyTorch-based audio source separation toolkit for researchers, Manuel Pariente, Samuele Cornell, Joris Cosentino, Sunit Sivasankaran, Efthymios Tzinis, Jens Heitkaemper, Michel Olvera, Fabian-Robert Stöter, Mathieu Hu, Juan M. Martín-Doñas, David Ditte, Ariel Frank, Antoine Deleforge, Emmanuel Vincent, Interspeech 2020.
- Demo: https://joriscos-asteroid-app-demo-live-app-wly8se.streamlit.app/
Toolbox for audio source separation distributed under the Q Public License.
Software for aligning a speech signal with its corresponding orthographic transcription. Using a phonetic lexicon and automatic grapheme-to-phoneme converters, all the potential sequences of phones corresponding to the text are generated. Then, using acoustic models, the tool finds the best phone sequence and provides the boundaries at both the phone level and the word level.
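The first step, generating every candidate phone sequence from pronunciation variants, can be illustrated with a small sketch. The toy lexicon and the function name are hypothetical; the acoustic scoring that selects the best sequence and places the boundaries is not shown.

```python
from itertools import product

# Hypothetical toy lexicon: each word has one or more pronunciation variants.
LEXICON = {
    "the": [["dh", "ah"], ["dh", "iy"]],
    "cat": [["k", "ae", "t"]],
}

def candidate_phone_sequences(words, lexicon):
    """Return every phone sequence obtained by choosing one variant per word."""
    variants = [lexicon[w] for w in words]
    sequences = []
    for choice in product(*variants):
        # Flatten the chosen per-word pronunciations into one phone list.
        sequences.append([phone for pron in choice for phone in pron])
    return sequences

candidates = candidate_phone_sequences(["the", "cat"], LEXICON)
```

The number of candidates is the product of the variant counts per word, which is why practical aligners usually encode the alternatives as a graph rather than an explicit list.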
Speech visualization tools
Written in Java, this software uses signal processing algorithms developed within the WinSnoori software. It has a double objective: to be a platform-independent signal visualization and manipulation tool, and to support the design of exercises for learning the prosody of a foreign language.
VisArtico is user-friendly software for visualizing EMA (electromagnetic articulography) data acquired by an articulograph. It has been designed so that it can directly use the data provided by the articulograph to display the articulatory coil trajectories, synchronized with the corresponding acoustic recordings. VisArtico is useful for the speech science community because it makes articulatory data more accessible.
Software intended to delineate contours of speech articulators in X-ray images, construct articulatory models and synthesize speech from X-ray films.
Software for recording audio corpora. It provides a simple way to record with a microphone, and is suited to recording sentences accompanied by information that guides the speaker.
The platform has been upgraded with the acquisition of the latest articulograph, the AG501, funded by the EQUIPEX ORTOLANG project.
Magnetic Resonance Imaging plays an increasing role in the investigation of speech production because it provides complete geometrical information about the vocal tract.