Synthesis of F0 contours using generation process model parameters predicted from unlabeled corpora: application to emotional speech synthesis

Article code	Journal code	Publication year	English article	Full-text version
9673495	1452053	2005	20-page PDF	Free download

Keywords

HMM-based speech synthesis

Related topics

Engineering and Basic Sciences, Computer Engineering, Signal Processing

Abstract

A corpus-based method of generating fundamental frequency (F0) contours from text was developed for Japanese. Instead of directly predicting F0 values, the method predicts command values of the F0 contour generation process model using binary decision trees. Since the model controls F0 movement over word-length or longer units, sudden undulations, which are unlikely in natural utterances, can be avoided even when a prediction is erroneous. The method includes a scheme for extracting the model commands from given F0 contours, which makes it possible to prepare the corpora for training the binary decision trees automatically. Since the accuracy of the extracted model commands in the training corpora is crucial for the method, constraints are applied to the location of commands. Although the method can generate any speaking style for which a corpus is available, this paper aims at realizing three types of emotional speech (anger, joy, and sadness) in addition to calm speech. The mismatches between the predicted and target contours for angry speech were similar to those for calm speech. Synthesis of emotional speech was then conducted. Phoneme durations were predicted by a similar corpus-based method, and segmental features were generated using an HMM-based speech synthesizer. A perceptual experiment was conducted on the synthesized speech, and the result indicated that anger could be conveyed well by the developed method. The result was less satisfactory for joy and sadness.
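
The abstract does not spell out the generation process model itself. As a rough illustration of why predicting commands rather than raw F0 values yields smooth contours, the sketch below assumes the model is the one widely known as the Fujisaki model, in which log F0 is a baseline plus the responses to phrase commands (impulses) and accent commands (steps). The constants `ALPHA`, `BETA`, and `GAMMA` are typical literature values, and the function names and command values are hypothetical; none of them come from this page.

```python
import math

# Assumed typical constants (not taken from this page).
ALPHA = 3.0   # natural angular frequency of the phrase control mechanism (1/s)
BETA = 20.0   # natural angular frequency of the accent control mechanism (1/s)
GAMMA = 0.9   # ceiling on the accent component


def phrase_response(t, alpha=ALPHA):
    """Impulse response of the phrase control mechanism (zero before onset)."""
    return alpha * alpha * t * math.exp(-alpha * t) if t >= 0 else 0.0


def accent_response(t, beta=BETA, gamma=GAMMA):
    """Step response of the accent control mechanism, clipped at gamma."""
    if t < 0:
        return 0.0
    return min(1.0 - (1.0 + beta * t) * math.exp(-beta * t), gamma)


def f0_contour(times, fb, phrase_cmds, accent_cmds):
    """Return F0 in Hz at each time in `times`.

    fb          -- baseline frequency in Hz
    phrase_cmds -- list of (onset_time, magnitude)
    accent_cmds -- list of (onset_time, offset_time, amplitude)
    """
    contour = []
    for t in times:
        log_f0 = math.log(fb)
        for t0, ap in phrase_cmds:
            log_f0 += ap * phrase_response(t - t0)
        for t1, t2, aa in accent_cmds:
            log_f0 += aa * (accent_response(t - t1) - accent_response(t - t2))
        contour.append(math.exp(log_f0))
    return contour


# Hypothetical commands: one phrase command and one accent command over 2 s.
times = [i * 0.01 for i in range(200)]
f0 = f0_contour(times, fb=120.0,
                phrase_cmds=[(0.0, 0.5)],
                accent_cmds=[(0.3, 0.7, 0.4)])
```

Because each command only shapes a slowly varying response, a mispredicted magnitude or timing shifts the contour gradually instead of producing frame-level jumps, which is the robustness the abstract describes.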

Publisher

Database: Elsevier - ScienceDirect
Journal: Speech Communication - Volume 46, Issues 3–4, July 2005, Pages 385-404

Authors

Keikichi Hirose, Kentaro Sato, Yasufumi Asano, Nobuaki Minematsu
