Two-stage phone duration modelling with feature construction and feature vector extension for the needs of speech synthesis

Article code	Journal code	Publication year	English article
558471	874934	2012	19-page PDF

Keywords

Feature construction; Text-to-speech synthesis; Statistical modelling

Related topics

Engineering and Basic Sciences; Computer Engineering; Signal Processing

Abstract

We propose a two-stage phone duration modelling scheme, which can be applied for the improvement of prosody modelling in speech synthesis systems. This scheme builds on a number of independent feature constructors (FCs) employed in the first stage, and a phone duration model (PDM) which operates on an extended feature vector in the second stage. The feature vector, which acts as input to the first stage, consists of numerical and non-numerical linguistic features extracted from text. The extended feature vector is obtained by appending the phone duration predictions estimated by the FCs to the initial feature vector. Experiments on the American-English KED TIMIT and on the Modern Greek WCL-1 databases validated the advantage of the proposed two-stage scheme, improving prediction accuracy over the best individual predictor, and over a two-stage scheme which just fuses the first-stage outputs. Specifically, when compared to the best individual predictor, a relative reduction in the mean absolute error and the root mean square error of 3.9% and 3.9% on the KED TIMIT, and of 4.8% and 4.6% on the WCL-1 database, respectively, is observed.

► Phone duration models used as feature constructors creating new features.
► The new features are appended to the initial feature vector.
► The extended feature vector is used to train the phone duration model in the second stage.
► Second-stage phone duration model outperforms the best baseline model.
► The proposed two-stage scheme with support vector regression improves accuracy.
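
The two-stage idea summarized above (first-stage feature constructors whose duration predictions are appended to the initial feature vector, then a second-stage phone duration model trained on the extended vector) can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the data are synthetic, the three feature constructors are hypothetical ridge regressors on fixed transforms of the input, and ridge regression stands in for the second-stage model (the highlights mention support vector regression). The helper names (`train_constructors`, `extend_features`, `relative_reduction`) are invented for this example, and the first-stage predictions are computed in-sample for simplicity.

```python
import numpy as np

def add_bias(X):
    """Append a constant column so the linear model has an intercept."""
    return np.hstack([X, np.ones((X.shape[0], 1))])

def fit_ridge(X, y, lam=1e-3):
    """Ridge regression (stand-in model, an assumption of this sketch)."""
    Xb = add_bias(X)
    A = Xb.T @ Xb + lam * np.eye(Xb.shape[1])
    return np.linalg.solve(A, Xb.T @ y)

def predict(w, X):
    return add_bias(X) @ w

# Hypothetical feature constructors (FCs): independent models that see the
# same initial feature vector through different fixed transforms.
TRANSFORMS = [
    lambda X: np.hstack([X, X ** 2]),
    np.tanh,
    np.abs,
]

def train_constructors(X, y):
    """Stage 1: train every FC on the initial feature vector."""
    return [(t, fit_ridge(t(X), y)) for t in TRANSFORMS]

def extend_features(X, constructors):
    """Append each FC's duration prediction to the initial feature vector."""
    preds = [predict(w, t(X)) for t, w in constructors]
    return np.hstack([X, np.column_stack(preds)])

def mae(y, y_hat):
    return float(np.mean(np.abs(y - y_hat)))

def rmse(y, y_hat):
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))

def relative_reduction(baseline_err, proposed_err):
    """Relative error reduction in percent, as reported in the abstract."""
    return 100.0 * (baseline_err - proposed_err) / baseline_err

def run_demo(seed=0):
    # Synthetic "durations" in ms; real systems would use linguistic features.
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(600, 4))
    y = (80 + 10 * X[:, 0] + 6 * X[:, 1] ** 2 + 4 * np.tanh(X[:, 2])
         + rng.normal(scale=2.0, size=600))
    X_tr, X_te, y_tr, y_te = X[:400], X[400:], y[:400], y[400:]

    fcs = train_constructors(X_tr, y_tr)               # stage 1
    pdm = fit_ridge(extend_features(X_tr, fcs), y_tr)  # stage 2
    X_te_ext = extend_features(X_te, fcs)
    y_hat = predict(pdm, X_te_ext)
    return {"mae": mae(y_te, y_hat), "rmse": rmse(y_te, y_hat),
            "n_features": X_te_ext.shape[1]}

if __name__ == "__main__":
    print(run_demo())
```

The extended vector has one extra column per feature constructor, so the second-stage model can weigh the first-stage predictions against the original features instead of only fusing the predictions. A stricter setup would generate the first-stage predictions out-of-fold before training the second stage; whether the paper does so is not stated in the abstract.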

Publisher

Database: Elsevier - ScienceDirect
Journal: Computer Speech & Language - Volume 26, Issue 4, August 2012, Pages 274–292

Authors

Alexandros Lazaridis, Todor Ganchev, Iosif Mporas, Evaggelos Dermatas, Nikos Fakotakis
