Comparing ANN and GMM in a voice conversion framework

Article code	Journal code	Publication year	English article	Full-text version
496412	862859	2012	11-page PDF	Free download


Keywords

Artificial neural networks; Gaussian mixture models; Energy profiles; Prosody; Pitch contour

Related topics

Engineering and basic sciences; Computer engineering; Computer science software


Abstract

In this paper, we present a comparative analysis of artificial neural networks (ANNs) and Gaussian mixture models (GMMs) for the design of a voice conversion system using line spectral frequencies (LSFs) as feature vectors. Both the ANN-based and GMM-based models are explored to capture nonlinear mapping functions for modifying the vocal tract characteristics of a source speaker according to a desired target speaker. The LSFs are used to represent the vocal tract transfer function of a particular speaker. Mapping of the intonation patterns (pitch contour) is carried out using a codebook-based model at the segmental level. The energy profile of the signal is modified using a fixed scaling factor defined between the source and target speakers at the segmental level. Two methods of residual modification, residual copying and residual selection, are used to generate the target residual signal. The performance of the ANN-based and GMM-based voice conversion (VC) systems is evaluated using subjective and objective measures. The results indicate that the proposed ANN-based model using the LSF feature set may be used as an alternative to state-of-the-art GMM-based models for designing a voice conversion system.
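The abstract does not say how the fixed energy scaling factor is defined. As a minimal sketch, assume it is the ratio of average target segment energy to average source segment energy over paired training segments; the function names and the toy data below are hypothetical and are not taken from the paper.

```python
import math

def segment_energy(samples):
    """Mean squared amplitude of one segment."""
    return sum(s * s for s in samples) / len(samples)

def fixed_energy_scale(source_segments, target_segments):
    """Assumed definition: average target energy divided by average source energy."""
    src = sum(segment_energy(s) for s in source_segments) / len(source_segments)
    tgt = sum(segment_energy(t) for t in target_segments) / len(target_segments)
    return tgt / src

def apply_energy_scale(samples, scale):
    """Multiply amplitudes by sqrt(scale) so the segment energy is multiplied by scale."""
    gain = math.sqrt(scale)
    return [gain * s for s in samples]

# Toy training data: the target speaker has twice the amplitude of the source.
source_train = [[0.1, -0.2, 0.1, 0.0], [0.2, 0.1, -0.1, -0.2]]
target_train = [[0.2, -0.4, 0.2, 0.0], [0.4, 0.2, -0.2, -0.4]]

scale = fixed_energy_scale(source_train, target_train)
test_segment = [0.1, -0.1, 0.05, 0.0]
converted = apply_energy_scale(test_segment, scale)
```

Because the factor is fixed, it is estimated once from training data and then applied unchanged to every segment of a test utterance.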

The desired spectral envelope along with the spectral envelopes predicted by the baseline Gaussian mixture model (GMM) and the proposed 5-layer feed-forward neural network, which map the vocal tract characteristics of a source speaker to those of a desired target speaker for voice conversion.

Highlights

► A database of four Indian speakers has been developed to design a voice conversion system.
► A 5-layer ANN-based model is proposed for vocal tract modification in a voice conversion framework.
► Spectral mapping using the ANN performs as well as the conventional GMM-based model.
► An epoch-based codebook model for pitch contour modification can capture the patterns of the desired target pitch contour.
► LP residual selection and LP residual copying methods are compared in terms of speaker identity and conversion quality.
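To make the idea of a 5-layer feed-forward mapping concrete, the sketch below runs one forward pass from a source LSF vector to a predicted target LSF vector. The layer sizes, the tanh hidden units, the linear output layer, and the random untrained weights are all assumptions for illustration; this page does not give the paper's architecture details or training procedure.

```python
import math
import random

def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases for one fully connected layer."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def forward(layers, x):
    """Propagate x through all layers: tanh on hidden layers, linear output."""
    for i, (weights, biases) in enumerate(layers):
        z = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]
        is_output = i == len(layers) - 1
        x = z if is_output else [math.tanh(v) for v in z]
    return x

rng = random.Random(0)
# Five layers of units (input, three hidden, output); sizes are hypothetical.
sizes = [10, 20, 5, 20, 10]
layers = [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]

source_lsf = [0.1 * (k + 1) for k in range(10)]  # toy 10-dimensional LSF vector
predicted_lsf = forward(layers, source_lsf)
```

In practice the weights would be trained on time-aligned source and target LSF vectors, for example by minimizing the squared error between predicted and target vectors.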

Publisher

Database: Elsevier - ScienceDirect
Journal: Applied Soft Computing - Volume 12, Issue 11, November 2012, Pages 3332–3342

Authors

R.H. Laskar, D. Chakrabarty, F.A. Talukdar, K. Sreenivasa Rao, K. Banerjee
