کد مقاله کد نشریه سال انتشار مقاله انگلیسی نسخه تمام متن
4973785 1451709 2017 13 صفحه PDF دانلود رایگان
عنوان انگلیسی مقاله ISI
Improved voicing decision using glottal activity features for statistical parametric speech synthesis
موضوعات مرتبط
مهندسی و علوم پایه مهندسی کامپیوتر پردازش سیگنال
پیش نمایش صفحه اول مقاله
Improved voicing decision using glottal activity features for statistical parametric speech synthesis
چکیده انگلیسی
A method to improve voicing decision using glottal activity features proposed for statistical parametric speech synthesis. In existing methods, voicing decision relies mostly on fundamental frequency F0, which may result in errors when the prediction is inaccurate. Even though F0 is a glottal activity feature, other features that characterize this activity may help in improving the voicing decision. The glottal activity features used in this work are the strength of excitation (SoE), normalized autocorrelation peak strength (NAPS), and higher-order statistics (HOS). These features obtained from approximated source signals like zero-frequency filtered signal and integrated linear prediction residual. To improve voicing decision and to avoid heuristic threshold for classification, glottal activity features are trained using different statistical learning methods such as the k-nearest neighbor, support vector machine (SVM), and deep belief network. The voicing decision works best with SVM classifier, and its effectiveness is tested using the statistical parametric speech synthesis. The glottal activity features SoE, NAPS, and HOS modeled along with F0 and Mel-cepstral coefficients in Hidden Markov model and deep neural network to get the voicing decision. The objective and subjective evaluations demonstrate that the proposed method improves the naturalness of synthetic speech.
ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Digital Signal Processing - Volume 71, December 2017, Pages 131-143
نویسندگان
, , ,