Tone nucleus-based multi-level robust acoustic tonal modeling of sentential F0 variations for Chinese continuous speech tone recognition

Article ID	Journal	Published Year	Pages	File Type
9673498	Speech Communication	2005	15 Pages	PDF

Abstract

Complex F0 variations make tone recognition of Chinese continuous speech difficult. In this paper, we propose building robust tonal acoustic models by modeling F0 variations at several levels, ranging from segmental factors, through tone co-articulation, to the interplay among tonality, tone co-articulation and high-level prosodic events. First, we extract the tone nucleus of each tonal F0 contour in the continuous speech and use only the features of the tone nucleus to estimate the tonal HMMs. This protects the tonal modeling from the influence of F0 transition loci at the sub-syllable level. Second, two techniques are adopted to model local tone co-articulation variations. Left- and right-context-dependent tri-tone HMMs estimated from tone nucleus features model the tone co-articulation effects. In addition, the anchoring-based left- and right-directional normalized tonal F0 contours prove to be efficient tone-discriminating features. Third, we model the interplay between tones and high-level prosodic events by building so-called hypo- and hyper-co-articulation-based tonal HMMs. The whole approach achieved significantly higher performance than the conventional method when applied to a speaker-dependent task.
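As an illustration of the left and right context dependence mentioned above, the following minimal sketch expands a tone sequence into tri-tone model names. It is not the authors' implementation: the function name `tritone_labels`, the `left-center+right` label format (borrowed from common triphone naming), and the `sil` boundary symbol are all assumptions made for this example.

```python
from typing import List


def tritone_labels(tones: List[int], boundary: str = "sil") -> List[str]:
    """Expand a tone sequence into context-dependent tri-tone labels.

    Each tone is named by its left and right neighbours, in the
    hypothetical format "left-center+right". Utterance edges are
    padded with a boundary symbol (an assumption of this sketch).
    """
    padded = [boundary] + [str(t) for t in tones] + [boundary]
    return [
        f"{padded[i - 1]}-{padded[i]}+{padded[i + 1]}"
        for i in range(1, len(padded) - 1)
    ]


# Example: a four-syllable utterance with Mandarin tones 3, 3, 4, 1.
labels = tritone_labels([3, 3, 4, 1])
print(labels)
```

Each label would then select one tonal HMM, so the same tone is modeled separately for each pair of neighbouring tones.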

Keywords

Tone recognition