Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
9673495 | 1452053 | 2005 | 20-page PDF | Free download |
English title of the ISI article
Synthesis of F0 contours using generation process model parameters predicted from unlabeled corpora: application to emotional speech synthesis
Related topics
Engineering and Basic Sciences
Computer Engineering
Signal Processing
English abstract
A corpus-based method of generating fundamental frequency (F0) contours from text was developed for Japanese. Instead of directly predicting F0 values, the method predicts command values of the F0 contour generation process model using binary decision trees. Since the model controls the F0 movement in word or in longer units, sudden undulations, unlikely in natural utterances, can be avoided even in the case of erroneous prediction. The method includes a scheme of extracting the model commands from given F0 contours, which makes it possible to prepare the corpora for training the binary decision trees automatically. Since accuracy of the extracted model commands in the training corpora is crucial for the method, constraints are applied on the location of commands. Although the method can generate any speaking styles if the corpora of the styles are available, this paper is aimed at realizing three types of emotional speech (anger, joy, and sadness) besides calm speech. The mismatches between the predicted and target contours for angry speech were similar to those for calm speech. Synthesis of emotional speech was then conducted. Phoneme durations were predicted in a similar corpus-based method, and segmental features were generated using an HMM-based speech synthesizer. A perceptual experiment was conducted for the synthesized speech, and the result indicated that anger could be conveyed well by the developed method. The result was less satisfactory for joy and sadness.
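The abstract does not spell out the F0 contour generation process model, so the following is only a minimal sketch, assuming the commonly cited command-response formulation in which log F0 is the sum of a baseline, phrase components (impulse responses), and accent components (clipped step responses). The constants `ALPHA`, `BETA`, and `GAMMA`, the function names, and the example command values are illustrative assumptions, not values from the paper. In the described method, the command timings and magnitudes would come from the binary decision trees rather than being written by hand.

```python
import math

# Assumed typical constants; the paper's actual settings are not given in the abstract.
ALPHA = 3.0   # natural angular frequency of the phrase control mechanism (1/s)
BETA = 20.0   # natural angular frequency of the accent control mechanism (1/s)
GAMMA = 0.9   # ceiling applied to the accent component


def phrase_response(t, alpha=ALPHA):
    """Impulse response of the phrase control mechanism (zero before onset)."""
    return alpha * alpha * t * math.exp(-alpha * t) if t >= 0 else 0.0


def accent_response(t, beta=BETA, gamma=GAMMA):
    """Step response of the accent control mechanism, clipped at gamma."""
    if t < 0:
        return 0.0
    return min(1.0 - (1.0 + beta * t) * math.exp(-beta * t), gamma)


def f0_contour(times, base_f0, phrase_cmds, accent_cmds):
    """Return F0 in Hz at each time point.

    phrase_cmds: list of (onset_time, magnitude)
    accent_cmds: list of (onset_time, offset_time, amplitude)
    """
    contour = []
    for t in times:
        log_f0 = math.log(base_f0)
        for t0, ap in phrase_cmds:
            log_f0 += ap * phrase_response(t - t0)
        for t1, t2, aa in accent_cmds:
            log_f0 += aa * (accent_response(t - t1) - accent_response(t - t2))
        contour.append(math.exp(log_f0))
    return contour


# Hypothetical commands: one phrase command and one accent command.
times = [i * 0.01 for i in range(200)]
contour = f0_contour(
    times,
    base_f0=120.0,
    phrase_cmds=[(0.0, 0.5)],
    accent_cmds=[(0.3, 0.7, 0.4)],
)
```

Because each command contributes a smooth response spanning a word or a longer unit, a mispredicted command shifts the contour gradually instead of producing a frame-level jump, which is the property the abstract highlights.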
Publisher
Database: Elsevier - ScienceDirect
Journal: Speech Communication - Volume 46, Issues 3-4, July 2005, Pages 385-404
Authors
Keikichi Hirose, Kentaro Sato, Yasufumi Asano, Nobuaki Minematsu