Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
565954 | 875876 | 2012 | 22-page PDF | Free download |
This work proposes Wavelet-Packet Cepstral coefficients (WPCCs) as an alternative approach to filter-bank energy-based feature extraction (FE) for automatic speech recognition (ASR). The rich coverage of time-frequency properties offered by Wavelet Packets (WPs) is used to obtain new sets of acoustic features, which achieve competitive and, in a number of settings, better performance than the widely adopted Mel-Frequency Cepstral coefficients (MFCCs) on the TIMIT corpus. The analysis stipulates concrete filter-bank design considerations for capturing most of the phone-discriminating information embedded in the speech signal; two important aspects are the filter-bank frequency selectivity and better discrimination in the lower frequency range [200 Hz–1 kHz] of the acoustic spectrum.
► The Wavelet-Packet Cepstral coefficient is presented as a new front-end for ASR.
► ASR performance is evaluated on the TIMIT corpus.
► Filter-bank solutions are derived across different levels of frequency selectivity.
► The obtained solutions outperform MFCCs in a number of settings.
► The proposed discriminative filter-bank frequency partitions match the Mel scale.
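To make the idea of filter-bank energy-based cepstral features concrete, the following is a minimal sketch and not the authors' implementation. It assumes, by analogy with MFCCs, that the coefficients are a DCT of log subband energies. The Haar filter, the uniform full-depth tree, the frame length, and the function names `haar_wp_leaves` and `wp_cepstrum` are all illustrative choices; the paper studies its own filter banks and frequency partitions.

```python
import numpy as np

def haar_wp_leaves(frame, depth):
    """Full Haar wavelet-packet tree (assumed filter); leaves in frequency order.

    The frame length must be divisible by 2**depth.
    """
    nodes = [np.asarray(frame, dtype=float)]
    for _ in range(depth):
        next_nodes = []
        for i, x in enumerate(nodes):
            low = (x[0::2] + x[1::2]) / np.sqrt(2.0)   # low-pass + downsample
            high = (x[0::2] - x[1::2]) / np.sqrt(2.0)  # high-pass + downsample
            # Children of an odd-indexed node swap places so that the
            # leaves end up sorted by increasing frequency.
            next_nodes.extend([low, high] if i % 2 == 0 else [high, low])
        nodes = next_nodes
    return nodes

def wp_cepstrum(frame, depth=4, n_ceps=8, eps=1e-10):
    """Log subband energies followed by an unnormalized DCT-II."""
    leaves = haar_wp_leaves(frame, depth)
    energies = np.array([np.sum(band ** 2) for band in leaves])
    log_e = np.log(energies + eps)  # eps guards against log(0)
    n = len(log_e)
    k = np.arange(n_ceps)[:, None]
    m = np.arange(n)[None, :]
    basis = np.cos(np.pi * k * (m + 0.5) / n)  # DCT-II basis
    return basis @ log_e

# Example: one 256-sample frame of synthetic noise (stand-in for speech).
frame = np.random.default_rng(0).standard_normal(256)
features = wp_cepstrum(frame)
```

With `depth=4` the sketch yields 16 uniform subbands. A partition that gives finer resolution in the lower frequency range, as the abstract emphasizes, would instead split only selected nodes of the tree.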
Journal: Speech Communication - Volume 54, Issue 6, July 2012, Pages 814–835