Article code: 568653
Journal code: 1452040
Publication year: 2014
English article: 15 pages, PDF
Full-text version: free download
English title of the ISI article
Voice conversion based on Gaussian processes by coherent and asymmetric training with limited training data
Persian translation of the title
تبدیل صدا بر اساس فرآیندهای گاوسی توسط آموزش های یکپارچه و نامتقارن با داده های آموزشی محدود
Related topics
Engineering and basic sciences, Computer engineering, Signal processing
English abstract


• Full implementation details of the GP-based method are described in depth.
• Coherent training that combines excitation and vocal tract is proposed.
• Asymmetric training is proposed for increasing the accuracy without additional costs.

Voice conversion (VC) is a technique that aims to map the individuality of a source speaker to that of a target speaker, and Gaussian mixture model (GMM) based methods are the prevalent approach. Despite their wide use, two major problems remain to be resolved: over-smoothing and over-fitting. The latter arises naturally when the model structure is too complicated for the limited amount of training data. Recently, a new voice conversion method based on Gaussian processes (GPs) was proposed, whose nonparametric nature significantly alleviates the over-fitting problem. Moreover, the GP framework makes non-linear mapping flexible to perform through the introduction of sophisticated kernel functions. This kind of method is therefore explored thoroughly in this paper. To further improve the performance of the GP-based method, a strategy for mapping prosodic and spectral features coherently is adopted, making the best use of the correlations between excitation and vocal tract features. In addition, the accuracy of computing the GP kernel functions can be improved by an asymmetric training strategy that allows the dimensionality of the input vectors to be reasonably higher than that of the output vectors without additional computational cost. Objective and subjective experiments confirm the effectiveness of the proposed method and demonstrate improvements of the GP-based method over the traditional GMM-based approach.
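
The GP mapping summarized in the abstract can be illustrated with plain GP regression. The following is a minimal sketch and not the paper's implementation: the squared-exponential kernel, the hyperparameter values, the synthetic data, and the names rbf_kernel and gp_predict are all assumptions made for illustration. It uses input vectors of higher dimensionality than the output vectors, in the spirit of the asymmetric setup; the kernel matrix stays N x N for N training frames whatever the input dimensionality.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel between rows of a (n, d) and b (m, d)."""
    sq_dist = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point rounding.
    return np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length_scale**2)

def gp_predict(x_train, y_train, x_test, length_scale=1.0, noise=1e-2):
    """GP posterior mean: K(x*, X) (K(X, X) + noise * I)^-1 Y."""
    k_train = rbf_kernel(x_train, x_train, length_scale)
    k_test = rbf_kernel(x_test, x_train, length_scale)
    alpha = np.linalg.solve(k_train + noise * np.eye(len(x_train)), y_train)
    return k_test @ alpha

# Toy stand-ins for aligned source/target feature frames (assumed sizes).
rng = np.random.default_rng(0)
n_frames, in_dim, out_dim = 50, 6, 2
x_train = rng.normal(size=(n_frames, in_dim))   # higher-dimensional inputs
y_train = np.tanh(x_train[:, :out_dim])         # synthetic nonlinear targets
x_test = rng.normal(size=(5, in_dim))

y_pred = gp_predict(x_train, y_train, x_test)
print(y_pred.shape)  # (5, 2)
```

In a real system the rows of x_train and y_train would be time-aligned source and target features, and the kernel hyperparameters would be fitted rather than fixed.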

Publisher
Database: Elsevier - ScienceDirect
Journal: Speech Communication - Volume 58, March 2014, Pages 124–138