Speech transcription
ANTS
Multipass system for transcribing audio data, in particular radio and TV shows. The audio stream is first split into homogeneous segments of manageable size. Each segment is then decoded with the most suitable acoustic model, using a large-vocabulary continuous speech recognition engine (Julius or Sphinx).
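The two passes described above can be illustrated with a minimal sketch. It is not the ANTS implementation: the frame labels, the `Segment` type, the `MODELS` mapping, and the model names are all hypothetical, and the actual decoding step with Julius or Sphinx is left out. The sketch assumes contiguous, fixed-length frames that already carry an acoustic class label.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Segment:
    start: float  # seconds
    end: float    # seconds
    label: str    # hypothetical acoustic class, e.g. "studio" or "telephone"


def split_homogeneous(
    frames: List[Tuple[float, str]], frame_dur: float, max_dur: float
) -> List[Segment]:
    """Group consecutive frames sharing a label; start a new segment
    whenever the label changes or the segment would exceed max_dur."""
    segments: List[Segment] = []
    for start, label in frames:
        last = segments[-1] if segments else None
        fits = (
            last is not None
            and last.label == label
            and (start + frame_dur) - last.start <= max_dur
        )
        if fits:
            last.end = start + frame_dur
        else:
            segments.append(Segment(start, start + frame_dur, label))
    return segments


# Hypothetical mapping from acoustic class to acoustic model name.
MODELS: Dict[str, str] = {
    "studio": "wideband_model",
    "telephone": "narrowband_model",
}


def choose_model(segment: Segment, default: str = "wideband_model") -> str:
    """Pick the acoustic model for a segment, falling back to a default."""
    return MODELS.get(segment.label, default)


frames = [(0.0, "studio"), (1.0, "studio"), (2.0, "studio"),
          (3.0, "telephone"), (4.0, "studio")]
segments = split_homogeneous(frames, frame_dur=1.0, max_dur=2.0)
for seg in segments:
    print(seg.start, seg.end, seg.label, choose_model(seg))
```

Capping the segment length keeps each decoding job small, and selecting the model per segment is what lets one stream mix, for example, studio and telephone speech.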
CoALT
Software for comparing the results of several automatic labeling processes through user-defined criteria.
COMPRISE Weakly Supervised STT
COMPRISE Weakly Supervised STT makes it possible to train Speech-to-Text models while reducing the need for time-consuming and expensive manual data transcription. It is part of the software tools developed by COMPRISE.
Webpage
Speech synthesis and transformation
COMPRISE Voice Transformer
The COMPRISE Voice Transformer increases voice privacy by taking speech audio as input and converting the speaker’s voice to another person’s voice. It is part of the software tools developed by COMPRISE.
Webpage
SoJA
Software for Text-To-Speech synthesis (TTS) which relies on a non-uniform unit selection algorithm. It performs all steps from text to speech signal output. A set of associated tools is also available for building a corpus for a TTS system (transcription, alignment…).
Webpage
Source separation and speech enhancement
Asteroid
Asteroid is a PyTorch-based audio source separation toolkit that enables fast experimentation on common datasets. It comes with source code that supports a large range of datasets and architectures, and a set of recipes to reproduce some important papers.
- Website: https://asteroid-team.github.io/
- GitHub: https://github.com/asteroid-team/asteroid
- Paper: Asteroid: the PyTorch-based audio source separation toolkit for researchers, Manuel Pariente, Samuele Cornell, Joris Cosentino, Sunit Sivasankaran, Efthymios Tzinis, Jens Heitkaemper, Michel Olvera, Fabian-Robert Stöter, Mathieu Hu, Juan M. Martín-Doñas, David Ditter, Ariel Frank, Antoine Deleforge, Emmanuel Vincent, Interspeech 2020.
- Demo: https://joriscos-asteroid-app-demo-live-app-wly8se.streamlit.app/
FASST
Toolbox for audio source separation distributed under the Q Public License.
Code
Speech alignment
LASTAS
Software for aligning a speech signal with its corresponding orthographic transcription. Using a phonetic lexicon and automatic grapheme-to-phoneme converters, all the potential sequences of phones corresponding to the text are generated. Then, using acoustic models, the tool finds the best phone sequence and provides the boundaries at both the phone level and the word level.
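As an illustration only, the sketch below covers two of the steps mentioned above: enumerating the candidate phone sequences from a lexicon with pronunciation variants, and deriving word boundaries from phone boundaries. It is not the LASTAS code. The `LEXICON` entries, the function names, and the phone end times are hypothetical, the grapheme-to-phoneme fallback is omitted, and the acoustic scoring that selects the best candidate is not modeled.

```python
from itertools import product
from typing import Dict, List, Sequence, Tuple

Pronunciation = Tuple[str, ...]

# Hypothetical lexicon: each word maps to one or more pronunciation variants.
LEXICON: Dict[str, List[Pronunciation]] = {
    "the": [("dh", "ah"), ("dh", "iy")],
    "cat": [("k", "ae", "t")],
}


def candidate_pronunciations(
    words: Sequence[str], lexicon: Dict[str, List[Pronunciation]]
) -> List[List[Pronunciation]]:
    """Enumerate every combination of per-word pronunciation variants."""
    variants = [lexicon[word] for word in words]
    return [list(choice) for choice in product(*variants)]


def word_boundaries(
    pronunciation: Sequence[Pronunciation], phone_ends: Sequence[float]
) -> List[float]:
    """A word ends where its last phone ends, so word end times can be
    read off the phone end times of the selected phone sequence."""
    ends: List[float] = []
    index = 0
    for phones in pronunciation:
        index += len(phones)
        ends.append(phone_ends[index - 1])
    return ends


candidates = candidate_pronunciations(["the", "cat"], LEXICON)
# Assume an acoustic model selected the first candidate and produced
# these (made-up) phone end times in seconds.
phone_ends = [0.05, 0.12, 0.20, 0.31, 0.40]
print(word_boundaries(candidates[0], phone_ends))
```

Enumerating full combinations grows quickly with the number of variants per word, which is why practical aligners usually encode the variants as a graph and search it rather than listing every sequence.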
Speech visualization tools
SNOORI
Java software that uses signal processing algorithms developed for the WinSnoori software. It has two objectives: to be a platform-independent signal visualization and manipulation tool, and to support the design of exercises for learning the prosody of a foreign language.
VisArtico
User-friendly software for visualizing EMA data acquired by an articulograph. It has been designed to use the data provided by the articulograph directly and to display the articulatory coil trajectories, synchronized with the corresponding acoustic recordings. VisArtico is very useful for the speech science community, and it makes the use of articulatory data more accessible.
Webpage
Xarticulators
Software intended to delineate contours of speech articulators in X-ray images, construct articulatory models and synthesize speech from X-ray films.
Data acquisition
JCorpusRecorder
Software for recording audio corpora. It offers a simple way to record with a microphone and is suited to recording sentences accompanied by information that guides the speaker.
EMA
The platform has been upgraded with the latest AG501 articulograph, funded by the EQUIPEX ORTOLANG project.
MRI
Magnetic Resonance Imaging plays an increasing role in the investigation of speech production because it provides complete geometrical information about the vocal tract.