Article ID: 567494
Journal: Speech Communication
Published Year: 2012
Pages: 14
File Type: PDF
Abstract

Spontaneous conversational speech has many characteristics that are currently not modelled well by HMM-based speech synthesis. To build synthetic voices that give the impression of someone taking part in a conversation, we need data that exhibits more of the speech phenomena associated with conversation than the carefully read-aloud sentences that are generally used. In this paper we show that synthetic voices built with HMM-based speech synthesis techniques from conversational speech data preserve the segmental and prosodic characteristics of frequent conversational speech phenomena. An analysis of an evaluation of the perceived quality and speaking style of HMM-based voices confirms that speech with conversational characteristics is instrumental for listeners to perceive successful integration of conversational speech phenomena in synthetic speech. The achieved synthetic speech quality provides an encouraging start for the continued use of conversational speech in HMM-based speech synthesis.

► Using speech from conversation in HMM-based speech synthesis.
► Preserved phonetic properties of conversational speech in synthetic speech.
► More natural synthetic speech than conventional voice.
► High correlation of perceived naturalness and perceived speaking style.

Related Topics
Physical Sciences and Engineering > Computer Science > Signal Processing