Article ID: 9673498
Journal: Speech Communication
Published Year: 2005
Pages: 15
File Type: PDF
Abstract
Complex F0 variations make tone recognition of continuous Chinese speech difficult. In this paper, we propose building robust tonal acoustic models by modeling F0 variations at several levels, from segmental factors through tone co-articulation to the interplay among tonality, tone co-articulation, and high-level prosodic events. First, we extract the tone nucleus of each tonal F0 contour in continuous speech and use only the tone-nucleus features to estimate the tonal HMMs. This protects the tonal models from the influence of F0 transition loci at the sub-syllable level. Second, two techniques are adopted to model local tone co-articulation variations: left- and right-context-dependent tri-tone HMMs estimated from tone-nucleus features model tone co-articulation effects, and anchoring-based left- and right-directional normalized tonal F0 contours prove to be efficient tone-discriminating features. Third, we model the interplay between tones and high-level prosodic events by building so-called hypo- and hyper-co-articulation-based tonal HMMs. The whole approach achieved significantly higher performance than the conventional method when applied to a speaker-dependent task.
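As a rough illustration of what "left- and right-context-dependent tri-tone" units could look like, the sketch below expands a tone sequence into context-dependent labels. The `left-center+right` label format, the `sil` boundary symbol, and the function name `tritone_labels` are assumptions borrowed from common triphone naming conventions; the abstract does not specify the notation or tooling actually used.

```python
from typing import List


def tritone_labels(tones: List[int], boundary: str = "sil") -> List[str]:
    """Expand a tone sequence into left/right context-dependent labels.

    Each tone is labeled with its left and right neighbors, e.g. "3-1+4"
    for tone 1 preceded by tone 3 and followed by tone 4. The label format
    and the boundary symbol are illustrative assumptions.
    """
    # Pad both ends so the first and last tones also have a context.
    padded = [boundary] + [str(t) for t in tones] + [boundary]
    return [
        f"{padded[i - 1]}-{padded[i]}+{padded[i + 1]}"
        for i in range(1, len(padded) - 1)
    ]


if __name__ == "__main__":
    # Tone sequence for a hypothetical three-syllable utterance.
    print(tritone_labels([3, 1, 4]))
```

Each resulting label would name one context-dependent tonal model, so the same center tone gets separate models for different neighboring tones.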
Related Topics
Physical Sciences and Engineering; Computer Science; Signal Processing