Article ID: 567035 | Journal: Speech Communication | Published Year: 2014 | Pages: 14 | File Type: PDF
Abstract

Pitch is a fundamental acoustic feature of speech and must therefore be determined during speech synthesis. While a range of communicative functions is attributed to pitch variation in all languages, in tone languages pitch plays a vital role in distinguishing the meaning of lexical items. Because a number of factors are assumed to affect the realisation of pitch, it is important to know which mechanisms are systematically responsible for it, so that they can be modelled effectively and robust speech synthesis systems can be developed in under-resourced environments. To this end, features influencing syllable pitch targets in continuous utterances in Yorùbá are investigated in a small speech corpus of 4 speakers. The previous syllable's pitch level is found to be strongly correlated with the pitch change between syllables, and a number of approaches and features are evaluated in this context. The resulting models can be used to predict utterance pitch targets for speech synthesisers (whether concatenative or statistical parametric), and may also prove useful in speech-recognition systems.

► We investigate tone realisation in continuous utterances in Yorùbá.
► A strong correlation between pitch level and pitch change between syllables is found.
► Models for predicting pitch targets on syllables are proposed and evaluated.
► Results suggest that both local and long-term features are useful when predicting pitch targets.
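
The relationship between the previous syllable's pitch level and the pitch change to the next syllable can be illustrated with a minimal sketch. This is not the authors' model: the function names, the semitone scale, and the toy pitch targets below are all assumptions made for illustration, and neither the sign nor the size of the resulting correlation says anything about the paper's results. The sketch pairs each syllable's pitch level with the change to the following syllable, computes a Pearson correlation, and fits a one-feature linear predictor.

```python
from statistics import mean


def pitch_changes(targets):
    """Pair each syllable's pitch level with the change to the next syllable."""
    return [(prev, cur - prev) for prev, cur in zip(targets, targets[1:])]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def fit_line(xs, ys):
    """Least-squares slope and intercept for ys as a linear function of xs."""
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return slope, my - slope * mx


def predict_next(prev_level, slope, intercept):
    """Predict the next pitch target as previous level plus predicted change."""
    return prev_level + slope * prev_level + intercept


# Hypothetical per-syllable pitch targets (semitones) for one utterance.
# These values are invented for illustration; they are not from the corpus.
toy_targets = [4.0, 1.0, 3.0, 0.5, 2.5, 0.0]

pairs = pitch_changes(toy_targets)
prev_levels = [level for level, _ in pairs]
changes = [change for _, change in pairs]

r = pearson(prev_levels, changes)
slope, intercept = fit_line(prev_levels, changes)
print(f"correlation between previous level and change: {r:.3f}")
print(f"predicted target after a level of 2.0: {predict_next(2.0, slope, intercept):.3f}")
```

A predictor of this kind uses only a local feature (the previous level). The highlights indicate that long-term features are also useful, so a fuller model would add further inputs alongside it.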
