A tone-modeling technique using a quantized F0 context to improve tone correctness in average-voice-based speech synthesis

Article code	Journal code	Publication year	English article	Full-text version
567500	876090	2012	11-page PDF	Free download

Keywords

HMM-based speech synthesis

Related topics

Engineering and Basic Sciences, Computer Engineering, Signal Processing

Abstract

This paper proposes a technique for improving tone correctness in speech synthesis of a tonal language based on an average-voice model trained with a corpus of nonprofessional speakers’ speech. We focused on reducing tone disagreements in speech data acquired from nonprofessional speakers without manually modifying the labels. To reduce the distortion in tone caused by inconsistent tonal labeling, quantized F0 symbols were utilized as the context for F0 to obtain an appropriate F0 model. With this technique, the tonal context could be extracted directly from the original speech, which prevented inconsistency between the speech data and the F0 labels generated from transcriptions, an inconsistency that affects the naturalness and tone correctness of synthetic speech. We examined two types of labeling for the tonal context, using phone-based and sub-phone-based quantized F0 symbols. Subjective and objective evaluations of the synthetic voice were carried out in terms of the intelligibility of tone and its naturalness. The experimental results from both the objective and subjective tests revealed that the proposed technique could improve not only the naturalness but also the tone correctness of synthetic speech under conditions where a small amount of speech data from nonprofessional target speakers was used.

► We focused on reducing tone disagreements in nonprofessional speakers' speech data.
► The quantized F0 symbols were utilized as the context for F0.
► We examined quantized F0 symbols based on phone and sub-phone boundary information.
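
The idea of deriving a tonal context directly from the speech signal can be illustrated with a minimal sketch. The abstract does not specify the quantization scheme, so everything below is an assumption made for illustration: the function name `quantize_f0_context`, the use of the mean log F0 over voiced frames, uniform quantization between the minimum and maximum voiced log F0 of the input, and the default of four levels are all hypothetical choices, not the authors' method. The segments may be phones or sub-phone units, depending on which boundary information is passed in.

```python
import math


def quantize_f0_context(f0_hz, segments, n_levels=4):
    """Assign one quantized F0 symbol per segment (hypothetical scheme).

    f0_hz:    frame-level F0 values in Hz; 0.0 marks an unvoiced frame.
    segments: list of (start_frame, end_frame) pairs, end exclusive.
              These may be phone or sub-phone boundaries.
    Returns one symbol per segment: an int in [0, n_levels - 1],
    or None when the segment contains no voiced frames.
    """
    # Work in the log domain, using voiced frames only.
    voiced_log = [math.log(v) for v in f0_hz if v > 0]
    if not voiced_log:
        return [None] * len(segments)

    # Assumption: the quantization range is the observed log-F0 range.
    lo, hi = min(voiced_log), max(voiced_log)
    span = hi - lo

    symbols = []
    for start, end in segments:
        seg = [math.log(v) for v in f0_hz[start:end] if v > 0]
        if not seg:
            symbols.append(None)  # fully unvoiced segment
            continue
        if span == 0:
            symbols.append(0)  # flat F0: every voiced segment gets level 0
            continue
        mean = sum(seg) / len(seg)
        # Uniform quantization; the top edge is clamped into the last level.
        level = int((mean - lo) / span * n_levels)
        symbols.append(min(level, n_levels - 1))
    return symbols
```

For example, with `f0_hz = [0.0, 100.0, 100.0, 200.0, 200.0, 0.0, 0.0, 150.0, 150.0]` and segments `[(0, 3), (3, 5), (5, 7), (7, 9)]`, the function returns `[0, 3, None, 2]`. Symbols of this kind would then be written into the context labels in place of tone labels generated from the transcription.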

Publisher

Database: Elsevier - ScienceDirect
Journal: Speech Communication - Volume 54, Issue 2, February 2012, Pages 245–255

Authors

Vataya Chunwijitra, Takashi Nose, Takao Kobayashi
