Article code: 566667
Journal code: 1452019
Publication year: 2016
English article: 18 pages, PDF
Full-text version: free download
English title of the ISI article
Improved chirp group delay based algorithms with applications to vocal tract estimation and speech recognition
Related topics
Engineering and Basic Sciences, Computer Engineering, Signal Processing
English abstract


• We point out important drawbacks in earlier work on chirp group delay processing.
• The RRCGD method eliminates these shortcomings while retaining the advantages of CGDGCI.
• RRCGD-ICA gives good results for high-pitched signals.
• The proposed 3-D nasal feature vector discriminates /m/, /n/, and /ng/ well.
• Comparative ASR results are given for magnitude-based and phase-based features.

In this paper we propose two algorithms for estimating the vocal tract from the Fourier transform phase of a given speech segment. In the first approach, we find the zeros of the z-transform, reflect all zeros lying outside the unit circle to the inside, and then compute the chirp group delay spectrum. This method eliminates many of the drawbacks of Bozkurt’s CGDGCI method and models spectral valleys well. For high-pitched sounds, however, the vocal tract estimate from this method is corrupted by source oscillations. In the second approach, we cast the problem within the framework of Independent Component Analysis and propose a method in which these effects are considerably suppressed. ASR results on the TIMIT database using features derived from the first method are comparable to those obtained using MFCC features. Further improvement in recognition accuracy over the MFCC baseline was obtained by using a lattice combining technique, resulting in a Phone Error Rate of 17%. Finally, exploiting the ability to model spectral valleys well, we propose additional features that distinguish the nasals /m/, /n/, and /ng/, which in turn increases their recognition accuracy.
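The first approach described in the abstract (find the zeros, reflect those outside the unit circle, compute the chirp group delay) can be sketched roughly as follows. This is an illustration only, not the authors' implementation: the function names, the evaluation radius of 1.05, the frequency grid, and the sign convention (group delay as the negative derivative of phase) are all assumptions, and the abstract does not specify windowing or frame selection, so none is applied here. The group delay is summed zero by zero using the standard closed form for a factor (1 - z_m z^-1) evaluated on the circle z = radius * exp(jw).

```python
import numpy as np


def reflect_zeros_inside(zeros):
    """Map every zero with |z| > 1 to its conjugate reciprocal 1/conj(z).

    Zeros already inside or on the unit circle are left unchanged.
    """
    zeros = np.asarray(zeros, dtype=complex)
    reflected = zeros.copy()
    outside = np.abs(zeros) > 1.0
    reflected[outside] = 1.0 / np.conj(zeros[outside])
    return reflected


def chirp_group_delay(zeros, radius=1.05, n_freqs=512):
    """Group delay of prod_m (1 - z_m z^-1) on the circle z = radius * exp(jw).

    `radius` is an illustrative value (an assumption); it must differ from
    the magnitude of every zero, otherwise the denominator vanishes.
    """
    zeros = np.asarray(zeros, dtype=complex)
    omega = np.linspace(0.0, np.pi, n_freqs)
    r = np.abs(zeros)[:, None] / radius               # zero magnitude relative to the circle
    alpha = omega[None, :] - np.angle(zeros)[:, None]  # angular distance to each zero
    numerator = r**2 - r * np.cos(alpha)
    denominator = 1.0 + r**2 - 2.0 * r * np.cos(alpha)
    return omega, np.sum(numerator / denominator, axis=0)


def rrcgd_sketch(frame, radius=1.05, n_freqs=512):
    """Zeros of the frame's z-transform -> reflect inside -> chirp group delay."""
    frame = np.asarray(frame, dtype=float)
    # X(z) = sum_n x[n] z^-n has the same nonzero roots as the polynomial
    # whose coefficients (highest power first) are x[0], x[1], ..., x[N-1].
    zeros = np.roots(frame)
    return chirp_group_delay(reflect_zeros_inside(zeros), radius, n_freqs)
```

A call such as `omega, gd = rrcgd_sketch(frame)` on a short speech frame would return the frequency grid in radians and the corresponding group delay values. Root finding on long frames can be slow and numerically delicate, so a practical implementation would likely need more care than this sketch takes.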

Publisher
Database: Elsevier - ScienceDirect
Journal: Speech Communication - Volume 81, July 2016, Pages 72–89