||Multipass system for transcribing audio data, in particular radio or TV shows. The audio stream is first split into homogeneous segments of a manageable size, and each segment is then decoded with the most suitable acoustic model by a large vocabulary continuous speech recognition engine (Julius or Sphinx).
||Software for comparing the results of several automatic labeling processes through user defined criteria.
|COMPRISE Weakly Supervised STT
||COMPRISE Weakly Supervised STT makes it possible to train Speech-to-Text models while reducing the need for time-consuming and expensive manual data transcription. It is part of the software tools developed by COMPRISE.
|COMPRISE Voice Transformer
||The COMPRISE Voice Transformer increases voice privacy by taking speech audio as input and converting the speaker’s voice to another person’s voice. It is part of the software tools developed by COMPRISE.
||Deep-learning-based multichannel speech enhancement and source separation software. BLSTM neural networks are used to initialize and re-estimate the source power spectra at every iteration of an EM algorithm. The source signals are then obtained by multichannel Wiener filtering or maximum SNR beamforming.
||Toolbox for audio source separation distributed under the Q Public License.
||Source separation framework, proposed this year by Liutkus et al. as a new and effective approach.
||Software for aligning a speech signal with its corresponding orthographic transcription. Using a phonetic lexicon and automatic grapheme-to-phoneme converters, the tool generates all the potential phone sequences corresponding to the text. Then, using acoustic models, it finds the best phone sequence and provides the boundaries at both the phone level and the word level.
||Software for Text-To-Speech (TTS) synthesis which relies on a non-uniform unit selection algorithm. It performs all steps from text to speech signal output. A set of associated tools is also available for building a corpus for a TTS system (transcription, alignment…).
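
To illustrate the idea behind unit selection mentioned in the last entry, here is a minimal, generic sketch and not the implementation used by the software listed above. It assumes the utterance has already been divided into target positions, each with a list of candidate units from the corpus, and it picks the sequence that minimizes the sum of a target cost and a join (concatenation) cost by dynamic programming. The names `select_units`, `target_cost` and `join_cost`, and the toy units of the form `(label, pitch)`, are hypothetical and chosen only for this example.

```python
from typing import Callable, List, Sequence, TypeVar

U = TypeVar("U")


def select_units(
    candidates: Sequence[Sequence[U]],
    target_cost: Callable[[int, U], float],
    join_cost: Callable[[U, U], float],
) -> List[U]:
    """Return one unit per position, minimizing total target + join cost."""
    if not candidates or any(len(c) == 0 for c in candidates):
        raise ValueError("every target position needs at least one candidate")

    # best[i][j]: lowest cost of any path ending at candidate j of position i
    best = [[target_cost(0, u) for u in candidates[0]]]
    # back[i][j]: index of the predecessor chosen at position i - 1
    back = [[-1] * len(candidates[0])]

    for i in range(1, len(candidates)):
        row_cost, row_back = [], []
        for unit in candidates[i]:
            costs = [
                best[i - 1][k] + join_cost(prev, unit)
                for k, prev in enumerate(candidates[i - 1])
            ]
            k_min = min(range(len(costs)), key=costs.__getitem__)
            row_cost.append(costs[k_min] + target_cost(i, unit))
            row_back.append(k_min)
        best.append(row_cost)
        back.append(row_back)

    # Backtrack from the cheapest final candidate.
    j = min(range(len(best[-1])), key=best[-1].__getitem__)
    path = []
    for i in range(len(candidates) - 1, -1, -1):
        path.append(candidates[i][j])
        j = back[i][j]
    return path[::-1]


# Toy data (hypothetical): each unit is (label, pitch in Hz).
demo_candidates = [
    [("a", 100), ("a", 180)],
    [("b", 110), ("b", 200)],
    [("c", 190), ("c", 105)],
]
demo_result = select_units(
    demo_candidates,
    target_cost=lambda i, u: 0.0,               # all candidates fit equally well
    join_cost=lambda p, u: abs(p[1] - u[1]),    # penalize pitch jumps at joins
)
print(demo_result)  # [('a', 100), ('b', 110), ('c', 105)]
```

In this toy setting the join cost alone drives the choice, so the selected sequence is the one with the smoothest pitch contour. A real system would combine several weighted features in both costs, and a non-uniform variant would also allow candidates that span several consecutive positions.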