Speaker-independent HMM-based voice conversion using adaptive quantization of the fundamental frequency

Article code	Journal code	Publication year	English article
565329	875732	2011	13-page PDF

Keywords

HMM-based speech synthesis; Voice conversion; Hidden Markov model (HMM)

Related topics

Engineering and Basic Sciences; Computer Engineering; Signal Processing

Abstract

This paper describes a speaker-independent HMM-based voice conversion technique that incorporates context-dependent prosodic symbols obtained using adaptive quantization of the fundamental frequency (F0). In the HMM-based conversion of our previous study, the input utterance of a source speaker is decoded into phonetic and prosodic symbol sequences, and the converted speech is generated using the decoded information from the pre-trained target speaker’s phonetically and prosodically context-dependent HMM. In our previous work, we generated the F0 symbol by quantizing the average log F0 value of each phone using the global mean and variance calculated from the training data. In the current study, these statistical parameters are obtained from each utterance itself, and this adaptive method improves the F0 conversion performance of the conventional one. We also introduce a speaker-independent model for decoding the input speech and model adaptation for training the target speaker’s model in order to reduce the required amount of training data under a condition where the phonetic transcription is available for the input speech. Objective and subjective experimental results for Japanese speech demonstrate that the adaptive quantization method gives better F0 conversion performance than the conventional one. Moreover, our technique with only ten sentences of the target speaker’s adaptation data outperforms the conventional GMM-based one using parallel data of 200 sentences.

► We propose an SI-HMM-based voice conversion technique using adaptive F0 quantization.
► Adaptive F0 quantization improved F0 conversion performance.
► SI-HMM-based conversion needs no training data of the source speaker.
► We can also significantly reduce the amount of the target speaker’s training data.
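
The difference between the conventional (global) and the adaptive F0 quantization described in the abstract can be illustrated with a minimal Python sketch. The abstract only states that the average log F0 of each phone is quantized using a mean and variance, taken either from the training data or from the utterance itself. Everything else here is an assumption made for illustration: the number of levels (three), bins one standard deviation wide and centred on the mean, the function names, and the made-up F0 values.

```python
import math
from statistics import fmean, pstdev


def quantize(log_f0, mu, sigma, n_levels=3):
    """Map per-phone average log F0 values to symbols 0..n_levels-1.

    Assumption: bins are one standard deviation wide and centred on mu;
    values outside the outermost bins are clipped.
    """
    symbols = []
    for x in log_f0:
        z = (x - mu) / sigma if sigma > 0 else 0.0
        idx = math.floor(z + n_levels / 2)
        symbols.append(min(max(idx, 0), n_levels - 1))
    return symbols


def quantize_global(log_f0, train_mu, train_sigma, n_levels=3):
    """Conventional variant: statistics come from the training data."""
    return quantize(log_f0, train_mu, train_sigma, n_levels)


def quantize_adaptive(log_f0, n_levels=3):
    """Adaptive variant: statistics come from the utterance itself."""
    return quantize(log_f0, fmean(log_f0), pstdev(log_f0), n_levels)


# Hypothetical per-phone average F0 (Hz) for one input utterance.
phone_f0_hz = [200.0, 260.0, 210.0, 150.0]
phone_log_f0 = [math.log(f) for f in phone_f0_hz]

# Hypothetical global statistics from training data with a lower pitch range.
global_symbols = quantize_global(phone_log_f0, math.log(120.0), 0.2)
adaptive_symbols = quantize_adaptive(phone_log_f0)

print("global:  ", global_symbols)
print("adaptive:", adaptive_symbols)
```

With these made-up numbers, the global statistics place every phone in the highest level, so the relative rise and fall of the contour is lost, while the utterance-level statistics keep it. This is only meant to show why per-utterance normalization can help when the input speaker's pitch range differs from the training data; it does not reproduce the paper's actual quantization rule.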

Publisher

Database: Elsevier - ScienceDirect
Journal: Speech Communication - Volume 53, Issue 7, September 2011, Pages 973–985

Authors

Takashi Nose, Takao Kobayashi
