IrcamAlign

IrcamAlign segments speech signals into phones and diphones by alignment, and computes a confidence measure for each phone. It also extracts the phonological structure (syllables, words and breath sequences) from the aligned sequence of phones.

Input: an audio file of speech and, optionally, a corresponding text file.
Output: .lab files at various levels of segmentation, for viewing in AudioSculpt or WaveSurfer.

IrcamAlign uses models learned from recordings. Models exist for French and English, for male and female voices. IrcamAlign can also be used on the singing voice, but this requires training specific models. It is used in particular for creating speech corpora (e.g. for use in text-to-speech synthesis).

Platform: Linux. Uses the HTK library and the LiaPhone software.
Author: P. Lanchantin, A. Gonzales (adaptation for English).
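The .lab output can be read back for further processing. The page does not specify the exact file layout, so the following is only a minimal sketch. It assumes one segment per line in the HTK style, "start end label [score]", with times in HTK's 100 ns units. The names Segment and parse_lab and the sample data are hypothetical and are not part of IrcamAlign.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Segment:
    start: float            # segment start, in seconds
    end: float              # segment end, in seconds
    label: str              # phone, syllable or word label
    score: Optional[float] = None  # optional per-segment score, if present


def parse_lab(text: str, time_unit: float = 1.0) -> List[Segment]:
    """Parse 'start end label [score]' lines (assumed layout).

    time_unit converts the file's time values to seconds,
    e.g. 1e-7 for HTK-style 100 ns units, 1.0 if already in seconds.
    """
    segments = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        score = None
        if len(fields) > 3:
            try:
                score = float(fields[3])
            except ValueError:
                score = None  # fourth field is not numeric; ignore it
        segments.append(
            Segment(
                start=float(fields[0]) * time_unit,
                end=float(fields[1]) * time_unit,
                label=fields[2],
                score=score,
            )
        )
    return segments


# Made-up sample data, for illustration only.
example = """0 1200000 s
1200000 2500000 a
2500000 4000000 l
"""

segments = parse_lab(example, time_unit=1e-7)
for seg in segments:
    print(f"{seg.label}: {seg.start:.2f}s to {seg.end:.2f}s")
```

If the files turn out to store times in seconds, which is another convention that label viewers accept, calling parse_lab with the default time_unit of 1.0 leaves the values unchanged.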

Demonstrations

Speech Research at Ircam demonstrated at Collège de France 2009 - 01/09/2010
Video of a conference

Associated Software: IrcamAlign
Associated People: Pierre Lanchantin
