Software

Speech transcription

ANTS

Multipass system for transcribing audio data, in particular radio or TV shows. The audio stream is first split into homogeneous segments of a manageable size; each segment is then decoded with the most appropriate acoustic model, using a large-vocabulary continuous speech recognition engine (Julius or Sphinx).
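
The two passes described above can be illustrated with a minimal sketch. It is not ANTS code: the per-second condition labels, the `max_len` cap and the model names in `ACOUSTIC_MODELS` are all hypothetical, and the actual decoding call to Julius or Sphinx is left out.

```python
from itertools import groupby

# Hypothetical mapping from an acoustic condition to an acoustic model name.
ACOUSTIC_MODELS = {"studio": "am_wideband", "phone": "am_narrowband"}


def split_homogeneous(labels, max_len):
    """Split a label sequence into (start, end, label) segments, end exclusive.

    Each segment has a single label (homogeneous) and at most max_len units.
    """
    segments = []
    pos = 0
    for label, run in groupby(labels):
        run_end = pos + len(list(run))
        # Cut long homogeneous runs into chunks of a manageable size.
        while pos < run_end:
            end = min(pos + max_len, run_end)
            segments.append((pos, end, label))
            pos = end
    return segments


def plan_decoding(labels, max_len=4):
    """Pair every segment with the model assumed to suit its condition."""
    return [
        (start, end, ACOUSTIC_MODELS[label])
        for start, end, label in split_homogeneous(labels, max_len)
    ]


# Made-up first-pass output: one condition label per second of audio.
labels = ["studio"] * 6 + ["phone"] * 3
plan = plan_decoding(labels)
```

With these toy labels, the six seconds of studio speech are cut into two segments and the three seconds of telephone speech form a third, each paired with its own model name.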

CoALT

Software for comparing the results of several automatic labeling processes through user-defined criteria.

COMPRISE Weakly Supervised STT

COMPRISE Weakly Supervised STT makes it possible to train Speech-to-Text models while reducing the need for time-consuming and expensive manual data transcription. It is part of the software tools developed by COMPRISE.
Webpage

Speech synthesis and transformation

COMPRISE Voice Transformer

The COMPRISE Voice Transformer increases voice privacy by taking speech audio as input and converting the speaker’s voice to another person’s voice. It is part of the software tools developed by COMPRISE.
Webpage

SoJA

Software for Text-To-Speech synthesis (TTS) which relies on a non-uniform unit selection algorithm. It performs all steps from text to speech signal output. A set of associated tools is also available for building a corpus for a TTS system (transcription, alignment…).
Webpage
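
The cost functions used by SoJA are not described here. As a rough illustration only, unit selection is commonly formulated as a dynamic-programming search that minimizes a sum of target costs and join costs. The sketch below makes that assumption; the function names, the integer corpus positions and the toy costs are all hypothetical.

```python
def select_units(candidates, target_cost, join_cost):
    """Viterbi search: pick one candidate per position with minimal total cost.

    candidates[i] lists the corpus units that could realize position i.
    """
    # best[i][j] = (cost of the best path ending at candidates[i][j], back-pointer)
    best = [[(target_cost(0, u), None) for u in candidates[0]]]
    for i in range(1, len(candidates)):
        row = []
        for u in candidates[i]:
            prev_cost, prev_j = min(
                (best[i - 1][j][0] + join_cost(p, u), j)
                for j, p in enumerate(candidates[i - 1])
            )
            row.append((prev_cost + target_cost(i, u), prev_j))
        best.append(row)

    # Backtrack from the cheapest final unit.
    j = min(range(len(best[-1])), key=lambda k: best[-1][k][0])
    total = best[-1][j][0]
    path = []
    for i in range(len(candidates) - 1, -1, -1):
        path.append(candidates[i][j])
        j = best[i][j][1]
    return path[::-1], total


# Toy setup: a unit is its position in the recorded corpus (made-up values).
candidates = [[3, 10], [4, 20], [5, 21]]


def toy_target_cost(position, unit):
    return 0  # assumption: every candidate matches its target equally well


def toy_join_cost(prev_unit, unit):
    # Units that were adjacent in the corpus concatenate for free.
    return 0 if unit == prev_unit + 1 else 1


path, total = select_units(candidates, toy_target_cost, toy_join_cost)
```

Because adjacent corpus units join at zero cost in this toy setup, the search favors the contiguous run 3, 4, 5, which is one way stretches of variable length can emerge from the selection.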

Source separation and speech enhancement

Asteroid

Asteroid is a PyTorch-based audio source separation toolkit that enables fast experimentation on common datasets. It comes with source code that supports a large range of datasets and architectures, and a set of recipes to reproduce some important papers.

Website: https://asteroid-team.github.io/
GitHub: https://github.com/asteroid-team/asteroid
Paper: Asteroid: the PyTorch-based audio source separation toolkit for researchers, Manuel Pariente, Samuele Cornell, Joris Cosentino, Sunit Sivasankaran, Efthymios Tzinis, Jens Heitkaemper, Michel Olvera, Fabian-Robert Stöter, Mathieu Hu, Juan M. Martín-Doñas, David Ditter, Ariel Frank, Antoine Deleforge, Emmanuel Vincent, Interspeech 2020.
Demo: https://joriscos-asteroid-app-demo-live-app-wly8se.streamlit.app/

FASST

Toolbox for audio source separation distributed under the Q Public License.
Code

Speech alignment

LASTAS

Software for aligning a speech signal with its corresponding orthographic transcription. Using a phonetic lexicon and automatic grapheme-to-phoneme converters, all the potential sequences of phones corresponding to the text are generated. Then, using acoustic models, the tool finds the best phone sequence and provides the boundaries at both the phone level and the word level.
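
A minimal sketch of two of the steps described above: enumerating the candidate phone sequences from pronunciation variants, and deriving word boundaries from phone boundaries. It is not LASTAS code. The toy lexicon, the phone symbols and the timing values are made up, and both the grapheme-to-phoneme fallback and the acoustic scoring that picks the best candidate are left out.

```python
from itertools import product

# Toy phonetic lexicon with pronunciation variants (hypothetical entries).
LEXICON = {
    "the": [["dh", "ah"], ["dh", "iy"]],
    "cat": [["k", "ae", "t"]],
}


def candidate_phone_sequences(words, lexicon):
    """Enumerate every combination of per-word pronunciation variants."""
    per_word = [lexicon[w] for w in words]
    return [list(choice) for choice in product(*per_word)]


def word_boundaries(pronunciations, phone_times):
    """Derive word-level (start, end) times from phone-level (start, end) times."""
    bounds = []
    i = 0
    for phones in pronunciations:
        segment = phone_times[i:i + len(phones)]
        # A word starts with its first phone and ends with its last phone.
        bounds.append((segment[0][0], segment[-1][1]))
        i += len(phones)
    return bounds


candidates = candidate_phone_sequences(["the", "cat"], LEXICON)

# Phone boundaries in seconds, as an aligner might return them for the
# first candidate (made-up values).
phone_times = [(0.00, 0.05), (0.05, 0.12), (0.12, 0.20), (0.20, 0.31), (0.31, 0.40)]
word_times = word_boundaries(candidates[0], phone_times)
```

Here `candidates` holds two sequences, one per variant of "the". In a real aligner the acoustic models would score each of them against the signal before any boundaries are reported.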

Speech visualization tools

SNOORI

Java software that uses signal processing algorithms developed within the WinSnoori software. It has two objectives: to be a platform-independent signal visualization and manipulation tool, and to support the design of exercises for learning the prosody of a foreign language.

VisArtico

User-friendly software for visualizing EMA data acquired by an articulograph. It has been designed to use the data provided by the articulograph directly and to display the articulatory coil trajectories, synchronized with the corresponding acoustic recordings. VisArtico makes articulatory data more accessible to the speech science community.
Webpage

Xarticulators

Software intended to delineate contours of speech articulators in X-ray images, construct articulatory models and synthesize speech from X-ray films.

Data acquisition

JCorpusRecorder

Software for recording audio corpora. It provides a simple way to record with a microphone, and is suited to recording sentences accompanied by information that guides the speaker.

EMA

The platform has been upgraded with the acquisition of the latest AG501 articulograph, funded by the EQUIPEX ORTOLANG project.

MRI

Magnetic Resonance Imaging plays an increasing role in the investigation of speech production because it provides complete geometric information about the vocal tract.