
PhD Defense | MeLos: Analysis and Modelling of Speech Prosody and Speaking Style


  • Date: Thursday, June 23, 2:30pm
  • Venue: IRCAM, Igor Stravinsky Conference Room
  • Defense in English


Thesis supervised by Xavier Rodet and Anne Lacheret, carried out as a member of the Analysis and Synthesis team at IRCAM.

Abstract

This thesis addresses the issue of modelling speech prosody for speech synthesis and presents MeLos: a complete system for the analysis and modelling of speech prosody, "the music of speech".

The objective of this thesis is to model the strategies, alternatives, and speaking style of a speaker for natural, expressive, and varied speech synthesis. The study makes original contributions, with particular attention to combining theoretical linguistics and statistical modelling into a complete speech prosody system.

A unified discrete/continuous context-dependent HMM is presented to model the symbolic and the acoustic characteristics of speech prosody: 

  1. A rich description of the text characteristics, based on a linguistic processing chain that includes surface and deep syntactic parsing, is proposed to refine the modelling of speech prosody in context.
  2. Segmental HMMs and Dempster-Shafer fusion are used to balance linguistic and metric constraints in the production of a pause.
  3. A trajectory model is proposed based on the stylization and the simultaneous modelling of short- and long-term F0 variations over various temporal domains.

The proposed system is used to model the strategies, alternatives, and speaking style of a speaker, and is extended to model the speaking styles of an arbitrary number of speakers using shared-context-dependent modelling and speaker normalization techniques.

Keywords: speech prosody, speaking style, speech synthesis, discrete/continuous HMMs, stylization, trajectory modelling, linguistic analysis

Jury

Nick Campbell (Professor, CLCS - University of Dublin), reviewer
Simon King (Professor, CSTR - University of Edinburgh), reviewer
Jean-François Bonastre (Professor, LIA - University of Avignon), examiner
Eric de la Clergerie (Researcher, INRIA - ALPAGE), examiner
David Wessel (Professor, CNMAT - University of California Berkeley), examiner
Jean-Luc Zarader (Professor, ISIR - University of Paris VI), examiner
Anne Lacheret (Professor, MoDyCo - University of Paris Ouest - La Défense), supervisor
Xavier Rodet (Emeritus Researcher, IRCAM - University of Paris VI), supervisor