Article code: 567777
Journal code: 876155
Publication year: 2006
English article: 11 pages, PDF
Full-text version: free download
English title of the ISI article
Modeling stylized invariance and local variability of prosody in text-to-speech synthesis
Related topics
Engineering and Basic Sciences, Computer Engineering, Signal Processing
Abstract

This paper investigates the stylized invariance and local variability of prosody patterns using a speech database containing two repetitions of 1000 sentences. The two repetitions, separated by a time span of 6 months, were recorded by a single professional speaker, who was instructed to read the sentences in the same reading style. Statistical analysis showed that the two repetitions vary fairly widely in their prosodic features, and that the variations can reach 50% of the speaker's full dynamic range. This shows the inadequacy of traditional prosody models that focus on capturing the universal invariance of prosody as precisely as possible. In this paper, we propose to model prosody by capturing its stylized invariance and retaining local variability with a soft prediction strategy, which predicts an acceptable region rather than a single fixed point in the multi-dimensional prosody space. A prosodic-constrained unit selection algorithm is devised under the soft prediction strategy.
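
The contrast the abstract draws between predicting a single point and predicting an acceptable region can be illustrated with a small sketch. This is not the paper's algorithm: the axis-aligned box used as the region, the `Unit` class, the `join_cost` field, and both selection functions are hypothetical names and simplifications chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class Unit:
    """A candidate speech unit (hypothetical structure, not from the paper)."""
    name: str
    prosody: Tuple[float, ...]  # e.g. (pitch in Hz, duration in ms)
    join_cost: float            # assumed cost of concatenating this unit


def in_region(prosody: Sequence[float],
              lower: Sequence[float],
              upper: Sequence[float]) -> bool:
    """True if every prosodic feature lies inside its [lower, upper] interval."""
    return all(lo <= x <= hi for x, lo, hi in zip(prosody, lower, upper))


def select_hard(units: Sequence[Unit], target: Sequence[float]) -> Unit:
    """Point prediction: pick the unit closest to one fixed target point."""
    return min(
        units,
        key=lambda u: sum((x - t) ** 2 for x, t in zip(u.prosody, target)),
    )


def select_soft(units: Sequence[Unit],
                lower: Sequence[float],
                upper: Sequence[float]) -> Unit:
    """Region prediction: treat every unit inside the region as prosodically
    acceptable, then choose among them by join cost."""
    acceptable = [u for u in units if in_region(u.prosody, lower, upper)]
    pool = acceptable or list(units)  # assumed fallback if the region is empty
    return min(pool, key=lambda u: u.join_cost)


units = [
    Unit("a", (200.0, 80.0), 0.90),
    Unit("b", (215.0, 95.0), 0.20),
    Unit("c", (300.0, 200.0), 0.05),
]
hard_choice = select_hard(units, target=(200.0, 80.0))
soft_choice = select_soft(units, lower=(180.0, 60.0), upper=(220.0, 100.0))
print(hard_choice.name, soft_choice.name)
```

In this toy data the point-based selection is forced onto the unit nearest the target even though it joins poorly, while the region-based selection is free to take a smoother-joining unit that still falls inside the acceptable prosodic range. A real system would presumably use a learned region and a full search over unit sequences rather than a fixed box and a single choice.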

Publisher
Database: Elsevier - ScienceDirect
Journal: Speech Communication - Volume 48, Issue 6, June 2006, Pages 716–726