Article code: 1101195
Journal code: 953538
Publication year: 2007
English article: 20 pages, PDF
Full-text version: free download
English title of the ISI article
Principal components of vocal-tract area functions and inversion of vowels by linear regression of cepstrum coefficients
Related subjects
Social Sciences and Humanities > Arts and Humanities > Language and Linguistics
English abstract

This paper addresses the following two hypotheses: (i) vocal-tract area functions of Japanese vowels can be accurately represented by a linear combination of only a few principal components which, furthermore, are similar to those reported in the literature for different languages; and (ii) the principal components’ weights can be predicted and area functions thereby accurately estimated from acoustics by linear regression of cepstrum parameters. To test these hypotheses, synchronized acoustic and vocal-tract 3D MRI data were recorded from an adult male Japanese speaker for both sustained and dynamic vowel utterances. The first two principal components explained covariations in vocal-tract shape and length accounting for 94–97% of the total variance, and indeed provided a cross-linguistic validation of the two underlying components of vowel production emergent from the literature. Multiple linear regression models were then evaluated for their accuracy in reconstructing the area functions of the dynamic utterance by predicting the first two PC coefficients, using either carefully measured formants or cepstral coefficients defined in various frequency bands. The best formant-based regression model required all four formants, with a mean adjusted correlation of 0.93 and mean absolute errors of 0.187 cm² in area and 0.131 cm in vocal-tract length. The best cepstrum-based regression model prescribed 24 cepstral coefficients defined in the frequency band 0–4 kHz, with a mean adjusted correlation of 0.92 and mean absolute errors of 0.102 cm² in area and 0.082 cm in vocal-tract length. These results suggest that vowel production features, properly constrained by PCA modeling, can be mapped with sufficient accuracy from easily measured cepstrum parameters.
More work is required to reduce the dependence on MRI data, to extend the applicability of these methods to different voice qualities and different speakers, and to select a smaller subset of acoustic parameters for more robust, real-time inversion.
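
The two-stage idea in the abstract (compress area functions with PCA, then predict the PC weights from cepstral coefficients by multiple linear regression) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the data are synthetic random stand-ins for the MRI-derived area functions and measured cepstra, vocal-tract length is not modeled, and every name here (`fit_pca`, `fit_linear_regression`, `predict`, the array sizes, the train/test split) is an assumption made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data (NOT the paper's recordings): two hidden factors
# drive both the "area functions" and the "cepstral coefficients".
n_frames, n_sections, n_cepstra, n_pcs = 200, 40, 24, 2
latent = rng.normal(size=(n_frames, n_pcs))
shape_basis = rng.normal(size=(n_pcs, n_sections))
areas = 3.0 + latent @ shape_basis + 0.05 * rng.normal(size=(n_frames, n_sections))
mixing = rng.normal(size=(n_pcs, n_cepstra))
cepstra = latent @ mixing + 0.05 * rng.normal(size=(n_frames, n_cepstra))


def fit_pca(X, k):
    """Return the mean, the first k principal directions, and their variance ratios."""
    mean = X.mean(axis=0)
    _, s, vt = np.linalg.svd(X - mean, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    return mean, vt[:k], explained[:k]


def fit_linear_regression(X, Y):
    """Least-squares fit of Y on X with an intercept column prepended."""
    X1 = np.hstack([np.ones((X.shape[0], 1)), X])
    W, *_ = np.linalg.lstsq(X1, Y, rcond=None)
    return W


def predict(W, X):
    """Apply a model returned by fit_linear_regression."""
    X1 = np.hstack([np.ones((X.shape[0], 1)), X])
    return X1 @ W


# Hypothetical split: fit on the first 150 frames, evaluate on the last 50.
train, test = slice(0, 150), slice(150, 200)

# Stage 1: PCA of the training area functions, keeping two components.
mean_area, components, explained = fit_pca(areas[train], n_pcs)
weights_train = (areas[train] - mean_area) @ components.T

# Stage 2: regress the PC weights on the cepstral coefficients.
W = fit_linear_regression(cepstra[train], weights_train)

# Inversion: cepstra -> predicted PC weights -> reconstructed area functions.
weights_pred = predict(W, cepstra[test])
areas_pred = mean_area + weights_pred @ components
mae = np.mean(np.abs(areas_pred - areas[test]))

print("variance explained by 2 PCs:", explained.sum())
print("mean absolute area error on held-out frames:", mae)
```

The printed numbers describe only this synthetic example and say nothing about the accuracy figures reported in the abstract. The sketch keeps the same structure as the described approach: the regression targets are the low-dimensional PC weights rather than the full area function, so the PCA basis constrains every reconstruction.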

Publisher
Database: Elsevier - ScienceDirect
Journal: Journal of Phonetics - Volume 35, Issue 1, January 2007, Pages 20–39