Article code: 567501
Journal code: 876090
Publication year: 2012
English article: 16 pages, PDF
Full-text version: free download
English title of the ISI article
A new representation for speech frame recognition based on redundant wavelet filter banks
Related topics
Engineering and Basic Sciences, Computer Engineering, Signal Processing
English abstract

Although the conventional wavelet transform possesses multi-resolution properties, it is not optimized for speech recognition systems. It performs worse than Mel Frequency Cepstral Coefficients (MFCCs), in which the Mel scale is based on human auditory perception. In this paper, new speech representations based on redundant wavelet filter-banks (RWFB) are proposed. RWFB parameters are much less shift-sensitive than those of the critically sampled discrete wavelet transform (DWT), so they seem to feature better performance in speech recognition tasks because of their better time-frequency localization ability. However, the improvement comes at the expense of higher redundancy. Several types of wavelet representations are introduced, including a combination of critically sampled DWT and different multi-channel redundant filter-banks down-sampled by 2. In order to find appropriate filter values for the multi-channel filter-banks, the effects of changing the zero moments of the proposed wavelet are discussed. The performances of the corresponding methods are compared in a phoneme recognition task using time delay neural networks. The results show that redundant multi-channel wavelet filter-banks work better than conventional DWT in speech recognition systems. The proposed four-channel higher density discrete wavelet filter-bank increases the recognition rate by up to approximately 8.95% compared with the critically sampled two-channel wavelet filter-bank.


► All Mel filters are band-pass.
► Critically sampled filter banks are shift variant.
► Critically sampled filter-banks are not suitable for speech representations.
► Multi-channel redundant wavelet filter-banks lead to better results.
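
The shift-sensitivity claim in the abstract and highlights can be illustrated with a small numerical sketch. The code below is not the paper's filter-bank: it assumes a one-level Haar wavelet and compares its critically sampled high-pass band with a fully undecimated version of the same filter (the paper's redundant filter-banks are multi-channel and still down-sampled by 2). The function names and the toy pulse signal are hypothetical and chosen only for illustration.

```python
import numpy as np

def haar_detail_decimated(x):
    """High-pass band of a one-level, critically sampled Haar DWT (assumed wavelet)."""
    x = np.asarray(x, dtype=float)
    return (x[0::2] - x[1::2]) / np.sqrt(2.0)

def haar_detail_undecimated(x):
    """Same Haar high-pass filter without down-sampling (circular boundary)."""
    x = np.asarray(x, dtype=float)
    return (x - np.roll(x, -1)) / np.sqrt(2.0)

def band_energy(c):
    """Sum of squared coefficients in one band."""
    return float(np.sum(c ** 2))

# Toy signal (illustrative only): a rectangular pulse and a copy delayed by one sample.
x = np.zeros(16)
x[4:10] = 1.0
x_shift = np.roll(x, 1)

print("decimated   :", band_energy(haar_detail_decimated(x)),
      band_energy(haar_detail_decimated(x_shift)))
print("undecimated :", band_energy(haar_detail_undecimated(x)),
      band_energy(haar_detail_undecimated(x_shift)))
```

In this toy case the pulse edges fall on even sample boundaries, so the critically sampled detail band is empty for the original signal but not for the delayed copy. The undecimated band has the same energy in both cases and is simply delayed by one sample. A feature vector built from band energies would therefore change with a one-sample shift in the first case only.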

Publisher
Database: Elsevier - ScienceDirect
Journal: Speech Communication - Volume 54, Issue 2, February 2012, Pages 256–271