Article code | Journal code | Publication year | English article | Full-text version
564738 | 1451751 | 2014 | 11-page PDF | Free download
English title of the ISI article
Robust feature extraction based on an asymmetric level-dependent auditory filterbank and a subband spectrum enhancement technique
Persian translation of the title
استخراج ویژگی مقاوم بر اساس یک بانک فیلتر شنوایی نامتقارن و وابسته به سطح و تکنیک بهبود طیف زیرباند
Related topics
Engineering and Basic Sciences > Computer Engineering > Signal Processing
English abstract


• We propose a cGCFB-based robust cepstral feature (RCGCC) for speech recognition tasks.
• A sigmoid-shaped suppression rule is introduced for auditory spectrum enhancement.
• Short-time cepstral mean and scale normalization is proposed to reduce mismatch.
• Performance evaluation is carried out on the AURORA-2, -4 and -5 corpora.
• The proposed RCGCC outperformed other front-ends in a real-time reverberant environment.
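
To make the idea of a sigmoid-shaped suppression rule concrete, here is a minimal Python sketch. It is not the paper's rule: the function names, the use of a per-subband SNR estimate as the sigmoid input, and the `slope`, `midpoint_db` and `floor` values are all assumptions chosen for illustration.

```python
import numpy as np


def sigmoid_gain(snr_db, slope=0.5, midpoint_db=5.0, floor=0.05):
    """Map per-subband SNR estimates (dB) to gains in [floor, 1].

    slope, midpoint_db and floor are hypothetical tuning values,
    not parameters taken from the paper.
    """
    snr_db = np.asarray(snr_db, dtype=float)
    gain = 1.0 / (1.0 + np.exp(-slope * (snr_db - midpoint_db)))
    # A gain floor keeps low-SNR subbands from being zeroed out entirely.
    return np.maximum(gain, floor)


def enhance_subbands(power_spectrum, noise_power):
    """Weight a (frames x subbands) auditory power spectrum by the gain.

    noise_power is assumed to come from some external noise estimator.
    """
    power_spectrum = np.asarray(power_spectrum, dtype=float)
    noise_power = np.asarray(noise_power, dtype=float)
    eps = 1e-12
    # Crude SNR estimate: (signal - noise) / noise, clipped to stay positive.
    clean_est = np.maximum(power_spectrum - noise_power, eps)
    snr_db = 10.0 * np.log10(clean_est / np.maximum(noise_power, eps))
    return sigmoid_gain(snr_db) * power_spectrum
```

Subbands with high estimated SNR pass almost unchanged, while subbands dominated by noise are attenuated smoothly rather than cut off by a hard threshold.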

In this paper we introduce a robust feature extractor, dubbed robust compressive gammachirp filterbank cepstral coefficients (RCGCC), based on an asymmetric, level-dependent compressive gammachirp filterbank and a sigmoid-shaped weighting rule for enhancing speech spectra in the auditory domain. The goal of this work is to improve the robustness of speech recognition systems in additive noise and real-time reverberant environments. As a post-processing scheme we employ a short-time feature normalization technique called short-time cepstral mean and scale normalization (STCMSN), which adjusts the scale and mean of the cepstral features to reduce the mismatch between cepstra from the training and test environments. To evaluate the proposed feature extractor in the context of speech recognition, we use the standard noisy AURORA-2 connected digit corpus, the meeting recorder digits (MRD) subset of the AURORA-5 corpus, and the AURORA-4 LVCSR corpus, which represent additive noise, reverberant acoustic conditions, and additive noise combined with different microphone channel conditions, respectively. The ETSI advanced front-end (ETSI-AFE), the recently proposed power normalized cepstral coefficients (PNCC), and conventional MFCC and PLP features are used for comparison. Experimental speech recognition results demonstrate that the proposed method is robust against both additive noise and reverberation. The proposed method provides results comparable to those of the ETSI-AFE and PNCC on the AURORA-2 and AURORA-4 corpora, and provides considerable improvements over the other feature extractors on the AURORA-5 corpus.
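
The abstract describes STCMSN only as adjusting the mean and scale of cepstral features over a short time span. The Python sketch below shows one way such a normalization could look. It rests on assumptions: a centered sliding window, the max-min range within the window as the scale, and a hypothetical default window length. The paper's exact definition may differ.

```python
import numpy as np


def stcmsn(cepstra, window=151):
    """Sliding-window mean and scale normalization of cepstral features.

    cepstra: array of shape (frames, coefficients).
    window:  window length in frames (hypothetical default).
    """
    cepstra = np.asarray(cepstra, dtype=float)
    n_frames = cepstra.shape[0]
    half = window // 2
    out = np.empty_like(cepstra)
    for t in range(n_frames):
        # Window is truncated at the utterance boundaries.
        lo, hi = max(0, t - half), min(n_frames, t + half + 1)
        seg = cepstra[lo:hi]
        mean = seg.mean(axis=0)
        # Assumed scale: range of each coefficient inside the window.
        scale = seg.max(axis=0) - seg.min(axis=0)
        scale = np.where(scale > 1e-8, scale, 1.0)  # guard flat segments
        out[t] = (cepstra[t] - mean) / scale
    return out
```

Under these assumptions the output is unchanged when the input cepstra are shifted by a constant or multiplied by a positive gain, which is the kind of train/test mismatch the normalization is meant to reduce.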

Publisher
Database: Elsevier - ScienceDirect
Journal: Digital Signal Processing - Volume 29, June 2014, Pages 147–157