Article ID: 566032
Journal: Speech Communication
Published Year: 2008
Pages: 13 Pages
File Type: PDF
Abstract

In this paper, we describe a novel approach to Thai speech synthesis. Spectrum, fundamental frequency (F0), and phone duration are modeled simultaneously in a unified HMM framework, and their parameter distributions are clustered independently using decision-tree-based context clustering. Contextual factors that affect spectrum, F0, and state duration, such as tone type and part of speech, are taken into account. Because Thai is a tonal language, the correctness of the synthesized tones is considered in addition to intelligibility and naturalness. To improve tone correctness in the synthesized speech, tone groups and tone types are used to design four different decision-tree structures for the tree-based context clustering process: a single binary tree structure, a simple tone-separated tree structure, a constancy-based-tone-separated tree structure, and a trend-based-tone-separated tree structure. Tone correctness is evaluated subjectively through a tone perception test with eight Thai listeners. The simple tone-separated tree structure gives the highest tone correctness, while the single binary tree structure gives the lowest. Independently of the tree structure, adding contextual tone information to all four structures significantly improves tone correctness. Moreover, an evaluation of syllable duration distortion across the four structures shows that the constancy-based-tone-separated and trend-based-tone-separated tree structures alleviate the distortions that appear with the simple tone-separated tree structure. Finally, mean opinion score (MOS) and comparison category rating (CCR) tests show that the implemented system reproduces prosody (or naturalness, in some sense) better than a unit-selection-based system built from the same speech database.
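
To make the idea of a tone-separated tree structure concrete, the following is a minimal sketch and not the authors' implementation. It assumes that "tone-separated" means the training contexts are first partitioned by tone type and a separate binary tree is then grown for each tone, so that no leaf ever pools samples from different tones. HMM-based synthesis systems typically split on a likelihood-based criterion over state distributions; here a plain variance-reduction criterion on a scalar F0 value is swapped in to keep the example short. All names (`build_tree`, `build_tone_separated`, `QUESTIONS`, the sample fields) and all data values are hypothetical.

```python
from collections import defaultdict
from statistics import pvariance

# The five lexical tones of Thai.
TONES = ("mid", "low", "falling", "high", "rising")

def sse(values):
    """Sum of squared deviations from the mean (0 for fewer than 2 values)."""
    return pvariance(values) * len(values) if len(values) > 1 else 0.0

def build_tree(samples, questions, min_gain=1e-6):
    """Greedy binary clustering: pick the yes/no question that most reduces
    the F0 squared error, then recurse. A simplified stand-in for the
    likelihood-based criteria used in real HMM context clustering."""
    parent = sse([s["f0"] for s in samples])
    best = None
    for name, pred in questions.items():
        yes = [s for s in samples if pred(s)]
        no = [s for s in samples if not pred(s)]
        if not yes or not no:
            continue  # this question does not split the node
        gain = parent - sse([s["f0"] for s in yes]) - sse([s["f0"] for s in no])
        if best is None or gain > best[0]:
            best = (gain, name, yes, no)
    if best is None or best[0] < min_gain:
        return {"leaf": samples}
    _, name, yes, no = best
    return {"question": name,
            "yes": build_tree(yes, questions, min_gain),
            "no": build_tree(no, questions, min_gain)}

def build_tone_separated(samples, questions):
    """Partition by tone first, then grow one tree per tone that occurs."""
    by_tone = defaultdict(list)
    for s in samples:
        by_tone[s["tone"]].append(s)
    return {t: build_tree(by_tone[t], questions) for t in TONES if by_tone[t]}

def leaves(tree):
    """Collect the sample lists stored at the leaves of a tree."""
    if "leaf" in tree:
        return [tree["leaf"]]
    return leaves(tree["yes"]) + leaves(tree["no"])

# Hypothetical context questions and made-up F0 values (Hz), for illustration.
QUESTIONS = {
    "pos_is_noun": lambda s: s["pos"] == "noun",
    "syllable_is_final": lambda s: s["final"],
}
SAMPLES = [
    {"tone": "mid", "pos": "noun", "final": False, "f0": 120.0},
    {"tone": "mid", "pos": "verb", "final": True, "f0": 110.0},
    {"tone": "rising", "pos": "noun", "final": False, "f0": 150.0},
    {"tone": "rising", "pos": "verb", "final": True, "f0": 170.0},
    {"tone": "falling", "pos": "noun", "final": True, "f0": 140.0},
    {"tone": "falling", "pos": "verb", "final": False, "f0": 135.0},
]

single_tree = build_tree(SAMPLES, QUESTIONS)          # one tree for all tones
tone_trees = build_tone_separated(SAMPLES, QUESTIONS)  # one tree per tone
```

In the single tree, whether tones end up separated depends on which questions happen to be chosen, whereas the per-tone trees guarantee it by construction. That guarantee comes at the cost of less training data per tree, which may relate to the duration distortions the abstract reports for the simple tone-separated structure.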

Related Topics
Physical Sciences and Engineering > Computer Science > Signal Processing