Article code | Journal code | Publication year | English article | Full-text version
568603 | 1452031 | 2015 | 20-page PDF | Free download
English title of the ISI article
Sub-band based histogram equalization in cepstral domain for speech recognition
Related topics
Engineering and Basic Sciences, Computer Engineering, Signal Processing
English abstract


• Proposes a novel extension to the Histogram Equalization method for noise compensation.
• Performs a sub-band specific equalization on the noisy cepstral features.
• Histogram analysis and recognition results show the usefulness of the proposed approach.
• Favorable for real-time systems due to superior performance and computational benefits.

This paper describes a novel framework for sub-band based Histogram Equalization (HEQ) applied to robust speech recognition. We propose a frequency-band-specific equalization to compensate for the noise distortion on the individual frequency bands. The proposed equalization framework is a two-stage process. In the first stage, conventional histogram equalization is performed. By analyzing the histograms of the equalized cepstra, we show that this conventional HEQ stage does not compensate for the sub-band specific noise distortion, even though the overall histogram is normalized. Hence, in the second stage, sub-band specific histogram equalization is performed. Every frame of cepstral coefficients is decomposed into low-frequency (LF) cepstra and high-frequency (HF) cepstra, and separate equalization is applied to each to compensate for LF- and HF-specific noise distortion. The LF and HF cepstra are obtained by applying simple averaging and differencing filters to the cepstral components within a particular frame. The proposed approach is referred to as Sub-band Histogram Equalization (S-HEQ). Using histogram analysis, we show that S-HEQ is able to compensate for the sub-band specific noise distortion. S-HEQ shows a consistent improvement over conventional HEQ, with relative WER improvements of 12% and 22.10% on the Aurora-2 and Aurora-4 databases, respectively. The proposed equalization approach can also be used with deep neural network based systems, where it has shown a consistent improvement in recognition accuracy over conventional HEQ. Finally, the efficacy of S-HEQ for embedded real-time speech applications is shown by comparing its trade-off between performance and computational complexity with that of other state-of-the-art noise compensation methods.
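The two-stage procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several details are assumptions made only for illustration: the reference distribution is taken to be a standard normal, equalization is done by rank over the utterance, and the averaging and differencing filters are assumed to be two-tap filters over adjacent cepstral coefficients. The function names (`heq`, `heq_per_dimension`, `split_bands`, `s_heq`) are hypothetical. The abstract does not say how the equalized LF and HF cepstra are recombined, so the sketch returns them separately.

```python
from statistics import NormalDist


def heq(values):
    """Rank-based histogram equalization of one feature track.

    Maps the empirical CDF of `values` onto a standard normal
    reference (the choice of reference is an assumption).
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ref = NormalDist()
    out = [0.0] * n
    for rank, i in enumerate(order):
        # (rank + 0.5) / n keeps the probability strictly inside (0, 1).
        out[i] = ref.inv_cdf((rank + 0.5) / n)
    return out


def heq_per_dimension(frames):
    """Equalize each coefficient (column) independently over time."""
    columns = [heq(list(col)) for col in zip(*frames)]
    return [list(row) for row in zip(*columns)]


def split_bands(frame):
    """Split one frame into LF and HF cepstra.

    Assumed filters: two-tap averaging and differencing over
    adjacent coefficients within the frame.
    """
    pairs = list(zip(frame[:-1], frame[1:]))
    lf = [(a + b) / 2 for a, b in pairs]
    hf = [(a - b) / 2 for a, b in pairs]
    return lf, hf


def s_heq(frames):
    """Stage 1: conventional HEQ. Stage 2: separate HEQ on LF and HF."""
    stage1 = heq_per_dimension(frames)
    bands = [split_bands(frame) for frame in stage1]
    lf_eq = heq_per_dimension([lf for lf, _ in bands])
    hf_eq = heq_per_dimension([hf for _, hf in bands])
    return lf_eq, hf_eq
```

Calling `s_heq(frames)` on a list of per-frame cepstral vectors returns two lists of equalized sub-band features, each with one fewer coefficient per frame than the input under the assumed two-tap filters.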

Publisher
Database: Elsevier - ScienceDirect
Journal: Speech Communication - Volume 69, May 2015, Pages 46–65