Tone correctness improvement in speaker dependent HMM-based Thai speech synthesis

Article code	Journal code	Publication year	English article	Full-text version
566032	875912	2008	13-page PDF	Free download

Keywords

HMM; Tone; Decision tree; Speech synthesis

Related topics

Engineering and Basic Sciences, Computer Engineering, Signal Processing

Abstract

In this paper, we describe a novel approach to the realization of Thai speech synthesis. Spectrum, fundamental frequency (F0), and phone duration are modeled simultaneously in a unified HMM framework, and their parameter distributions are clustered independently using a decision-tree-based context clustering technique. A group of contextual factors that affect spectrum, F0, and state duration, such as tone type and part of speech, are taken into account. Since Thai is a tonal language, not only intelligibility and naturalness but also the correctness of the synthesized tone is considered. To improve the tone correctness of the synthesized speech, tone groups and tone types are used to design four different decision-tree structures for the tree-based context clustering process: a single binary tree structure, a simple tone-separated tree structure, a constancy-based-tone-separated tree structure, and a trend-based-tone-separated tree structure. A subjective evaluation of tone correctness is conducted using the tone perception of eight Thai listeners. The simple tone-separated tree structure gives the highest level of tone correctness, while the single binary tree structure gives the lowest. In addition to the tree structure, additional contextual tone information, applied to all of the decision-tree structures, achieves a significant improvement in tone correctness. Moreover, an evaluation of syllable duration distortion among the four structures shows that the constancy-based-tone-separated and trend-based-tone-separated tree structures can alleviate the distortions that appear when using the simple tone-separated tree structure. Finally, MOS and CCR tests show that the implemented system gives better reproduction of prosody (or naturalness, in some sense) than a unit-selection-based system using the same speech database.

Publisher

Database: Elsevier - ScienceDirect
Journal: Speech Communication - Volume 50, Issue 5, May 2008, Pages 392–404

Authors

Suphattharachai Chomphan, Takao Kobayashi
