Discriminative training and explicit duration modeling for HMM-based automatic segmentation

Article code	Journal code	Publication year	English article	Full-text version
10370829	876534	2005	14-page PDF	Free download

Keywords

Discriminative training; Automatic segmentation; Speech synthesis

Related topics

Engineering and Basic Sciences, Computer Engineering, Signal Processing

Abstract

HMM-based automatic segmentation is widely used in corpus construction for concatenative speech synthesis. Because the main sources of inaccuracy in HMM-based automatic segmentation are the HMM training criterion and duration control, we study these two issues. For HMM training, we apply discriminative training and introduce a new criterion, Minimum SeGmentation Error (MSGE). In this method, a loss function directly related to the segmentation error is defined, and the parameters are optimized with the Generalized Probabilistic Descent (GPD) algorithm. For duration control, we apply explicit duration models and propose a two-step segmentation method that addresses the computational cost by incorporating the duration model in a postprocessing step. In the experiments, both techniques significantly improve segmentation accuracy, with different emphases: MSGE-based discriminative training improves accuracy at sensitive boundaries, i.e., boundaries where a segmentation error is likely to cause a noticeable degradation in speech synthesis quality, while explicit duration modeling eliminates large errors. Combining the two techniques reduced the average error from 6.86 ms to 5.79 ms on Japanese data and from 8.67 ms to 6.61 ms on Chinese data. At the same time, the number of errors larger than 30 ms was reduced by 25% and 51% on Chinese and Japanese data, respectively.
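The two figures reported in the abstract (average error and the number of errors larger than 30 ms) can be illustrated with a small sketch. It assumes the average is the mean absolute difference between automatic and reference boundary times; the abstract does not state the exact definition, and the function name, parameter names, and sample values below are hypothetical.

```python
from typing import Sequence, Tuple


def boundary_error_stats(
    auto_ms: Sequence[float],
    ref_ms: Sequence[float],
    large_threshold_ms: float = 30.0,
) -> Tuple[float, int]:
    """Return (mean absolute boundary error in ms, count of errors above the threshold).

    Assumption: the i-th automatic boundary corresponds to the i-th
    reference boundary, and both are given in milliseconds.
    """
    if len(auto_ms) != len(ref_ms):
        raise ValueError("boundary lists must have the same length")
    if not auto_ms:
        raise ValueError("at least one boundary is required")

    # Absolute deviation of each automatic boundary from its reference.
    errors = [abs(a - r) for a, r in zip(auto_ms, ref_ms)]

    mean_error = sum(errors) / len(errors)
    # "Large" errors are those strictly greater than the threshold (30 ms here).
    large_errors = sum(1 for e in errors if e > large_threshold_ms)
    return mean_error, large_errors


# Hypothetical boundary times, for illustration only (not data from the paper).
auto = [100.0, 250.0, 410.0, 600.0]
ref = [104.0, 245.0, 450.0, 598.0]
mean_error, large_errors = boundary_error_stats(auto, ref)
print(mean_error, large_errors)
```

Reporting both numbers is useful because a single large misplacement can be hidden in the mean, which is why the abstract tracks errors above 30 ms separately.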

Publisher

Database: Elsevier - ScienceDirect
Journal: Speech Communication - Volume 47, Issue 4, December 2005, Pages 397-410

Authors

Yi-Jian Wu, Hisashi Kawai, Jinfu Ni, Ren-Hua Wang
