Free article download: Toward invariant functional representations of variable surface fundamental frequency contours: Synthesizing speech melody via model-based stochastic learning

Article code	Journal code	Publication year	English article	Full-text version
565924	1452041	2014	28-page PDF	Free download

English title of the ISI article

Toward invariant functional representations of variable surface fundamental frequency contours: Synthesizing speech melody via model-based stochastic learning

Keywords

Prosody modeling, Target approximation, Parallel encoding, Analysis-by-synthesis, Simulated annealing

Related topics

Engineering and Basic Sciences, Computer Engineering, Signal Processing

English abstract

• High synthetic accuracy of prosody achieved for Thai, Mandarin and English.
• Many-to-one mapping from contextually variable F0 to invariant functional targets.
• Effective handling of both contextual and non-contextual variability.
• Large-scale, fully detailed prosody synthesis as a tool for theory testing.
• Freely available as Praat scripts and plug-ins to the speech science community.

Variability has been one of the major challenges for both theoretical understanding and computer synthesis of speech prosody. In this paper we show that economical representation of variability is the key to effective modeling of prosody. Specifically, we report the development of PENTAtrainer, a trainable yet deterministic prosody synthesizer based on an articulatory–functional view of speech. We show with testing results on Thai, Mandarin and English that it is possible to achieve high-accuracy predictive synthesis of fundamental frequency (F0) contours with very small sets of parameters obtained through stochastic learning from real speech data. The first key component of this system is syllable-synchronized sequential target approximation, implemented as the qTA model, which is designed to simulate, for each tonal unit, a wide range of contextual variability with a single invariant target. The second key component is the automatic learning of function-specific targets through stochastic global optimization, guided by a layered pseudo-hierarchical functional annotation scheme, which requires the manual labeling of only the temporal domains of the functional units. The results in terms of synthesis accuracy demonstrate that effective modeling of contextual variability is also the key to effective modeling of function-related variability. Additionally, we show that, being both theory-based and trainable (hence data-driven), computational systems like PENTAtrainer can serve as an effective modeling tool in basic research, with which the level of falsifiability in theory testing can be raised and a closer link between basic and applied research in speech science can be developed.
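
The two components named in the abstract can be illustrated with a small Python sketch. It is not PENTAtrainer's code (which is distributed as Praat scripts) and the abstract gives no equations; the sketch assumes the commonly cited qTA form, in which F0 approaches a linear target m*t + b as a critically damped third-order response with rate lam. The function names, the initial guess, the step sizes, and the cooling schedule are all hypothetical choices made for illustration, and the "observed" contour is synthetic.

```python
import math
import random


def qta_contour(m, b, lam, duration, state0=(0.0, 0.0, 0.0), n=50):
    """F0 samples for one syllable approaching the linear target m*t + b.

    Assumed form (not stated in the abstract):
        f0(t) = (m*t + b) + (c1 + c2*t + c3*t**2) * exp(-lam*t)
    where c1..c3 are fixed by the initial F0, velocity and acceleration.
    """
    y0, v0, a0 = state0
    c1 = y0 - b
    c2 = v0 + c1 * lam - m
    c3 = (a0 + 2.0 * c2 * lam - c1 * lam ** 2) / 2.0
    times = [duration * i / (n - 1) for i in range(n)]
    return [(m * t + b) + (c1 + c2 * t + c3 * t * t) * math.exp(-lam * t)
            for t in times]


def rmse(a, b):
    """Root-mean-square error between two equally long contours."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def anneal(observed, duration, state0, steps=2000, seed=0):
    """Fit (m, b, lam) to one observed contour by simulated annealing."""
    rng = random.Random(seed)
    current = [0.0, 0.0, 20.0]  # hypothetical initial guess for (m, b, lam)
    n = len(observed)
    cur_err = rmse(qta_contour(*current, duration, state0, n), observed)
    best, best_err = list(current), cur_err
    for k in range(steps):
        temp = (1.0 - k / steps) + 1e-6  # linear cooling (illustrative)
        cand = [current[0] + rng.gauss(0.0, 2.0),
                current[1] + rng.gauss(0.0, 0.5),
                max(1.0, current[2] + rng.gauss(0.0, 2.0))]
        err = rmse(qta_contour(*cand, duration, state0, n), observed)
        # Always accept improvements; accept worse moves with a
        # probability that shrinks as the temperature falls.
        if err < cur_err or rng.random() < math.exp((cur_err - err) / temp):
            current, cur_err = cand, err
            if err < best_err:
                best, best_err = list(cand), err
    return best, best_err


# Synthetic "observed" contour (semitones over a 0.2 s syllable).
observed = qta_contour(-10.0, 3.0, 30.0, 0.2)
params, error = anneal(observed, 0.2, (0.0, 0.0, 0.0))
print("fitted (m, b, lam):", params, "RMSE:", error)
```

In a multi-syllable setting, the final F0, velocity and acceleration of one syllable would be passed as `state0` to the next, which is how a single invariant target can yield different surface contours in different contexts.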


Publisher

Database: Elsevier - ScienceDirect
Journal: Speech Communication - Volume 57, February 2014, Pages 181–208

Authors

Yi Xu, Santitham Prom-on
