A novel framework for noise robust ASR using cochlear implant-like spectrally reduced speech

Article ID	Journal	Published Year	Pages	File Type
567514	Speech Communication	2012	15 Pages	PDF

Abstract

We propose a novel framework for noise robust automatic speech recognition (ASR) based on cochlear implant-like spectrally reduced speech (SRS). Two experimental protocols (EPs) are proposed to clarify the advantage of using SRS for noise robust ASR. Both EPs assess SRS in the training and testing environments, and one of the two additionally applies speech enhancement to improve the quality of the testing speech. In training, SRS is synthesized from the original clean speech; in testing, SRS is synthesized either directly from noisy speech or from enhanced speech signals. The synthesized SRS is then recognized by ASR systems trained on SRS signals generated with the same synthesis parameters. Experiments show that the word accuracy obtained with the SRS-based ASR systems is significantly improved compared with the baseline non-SRS ASR systems. We also propose a measure of the training/testing mismatch based on the Kullback–Leibler divergence. The numerical results show that using SRS in ASR systems significantly reduces the training/testing mismatch caused by environmental noise. The HMM-based ASR systems were trained, and the recognition tests performed, using the HTK toolkit and the Aurora 2 speech database.
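The abstract does not give the SRS synthesis parameters, so the following is only a rough sketch of a cochlear implant-like synthesis in the general vocoder style: the signal is split into a few sub-bands, each sub-band's amplitude envelope is extracted, and the envelopes modulate sinusoids at the band centre frequencies. The function names, the number of bands, the 200–3700 Hz range, the 50 Hz envelope cutoff, and the simple biquad and one-pole filters are all assumptions made for illustration, not the authors' settings.

```python
import math


def bandpass(signal, fs, f_lo, f_hi):
    """Second-order band-pass biquad centred on the geometric mean of the band edges."""
    f0 = math.sqrt(f_lo * f_hi)
    q = f0 / (f_hi - f_lo)
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b0, b2 = alpha / a0, -alpha / a0          # b1 is zero for this filter type
    a1, a2 = -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0
    out = []
    x1 = x2 = y1 = y2 = 0.0
    for x in signal:
        y = b0 * x + b2 * x2 - a1 * y1 - a2 * y2
        out.append(y)
        x2, x1 = x1, x
        y2, y1 = y1, y
    return out


def envelope(band, fs, cutoff_hz):
    """Amplitude envelope: full-wave rectification followed by a one-pole low-pass."""
    a = math.exp(-2.0 * math.pi * cutoff_hz / fs)
    env, state = [], 0.0
    for x in band:
        state = (1.0 - a) * abs(x) + a * state
        env.append(state)
    return env


def synthesize_srs(signal, fs=8000, n_bands=8, f_min=200.0, f_max=3700.0,
                   env_cutoff_hz=50.0):
    """Sum of sinusoids at the band centres, each modulated by its band envelope.

    All default parameter values are illustrative assumptions.
    """
    ratio = (f_max / f_min) ** (1.0 / n_bands)
    edges = [f_min * ratio ** k for k in range(n_bands + 1)]  # log-spaced band edges
    srs = [0.0] * len(signal)
    for f_lo, f_hi in zip(edges[:-1], edges[1:]):
        env = envelope(bandpass(signal, fs, f_lo, f_hi), fs, env_cutoff_hz)
        fc = math.sqrt(f_lo * f_hi)
        for n, e in enumerate(env):
            srs[n] += e * math.sin(2.0 * math.pi * fc * n / fs)
    return srs
```

In a setup like the one described, the same call (same `n_bands`, band range, and envelope cutoff) would be applied to the clean training speech and to the noisy or enhanced testing speech, so that both sides of the recognizer see signals produced with identical synthesis parameters.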

► ASR systems based on cochlear implant-like SRS gain noise robustness.
► SRS synthesized from clean speech is used in training, and SRS synthesized from noisy speech is used in testing.
► Train/test mismatch can be measured via the Kullback–Leibler divergence (KLD).
► Using SRS in ASR significantly reduces the train/test mismatch, as measured by KLD.
► Experiments are performed using the Aurora 2 database and the HTK toolkit.
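The abstract does not say how the Kullback–Leibler divergence is estimated. As a minimal sketch, one simple option is to fit a diagonal-covariance Gaussian to the training feature frames and another to the testing feature frames, then use the closed-form KLD between the two. The Gaussian model, the direction of the divergence (test relative to train), and the function names below are assumptions for illustration only.

```python
import math


def fit_diag_gaussian(frames):
    """Per-dimension mean and (biased) variance of a list of feature vectors."""
    n, dim = len(frames), len(frames[0])
    means = [sum(f[d] for f in frames) / n for d in range(dim)]
    variances = [sum((f[d] - means[d]) ** 2 for f in frames) / n for d in range(dim)]
    return means, variances


def kl_diag_gaussians(p, q, floor=1e-8):
    """Closed-form KL(p || q) for two diagonal-covariance Gaussians."""
    (mu_p, var_p), (mu_q, var_q) = p, q
    total = 0.0
    for mp, vp, mq, vq in zip(mu_p, var_p, mu_q, var_q):
        vp, vq = max(vp, floor), max(vq, floor)  # guard against zero variance
        total += math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0
    return 0.5 * total


def train_test_mismatch(train_frames, test_frames):
    """Assumed mismatch score: KL(test || train) under the Gaussian model."""
    return kl_diag_gaussians(fit_diag_gaussian(test_frames),
                             fit_diag_gaussian(train_frames))


# Toy one-dimensional frames: the test mean is shifted by one standard deviation.
train = [[0.0], [2.0]]
test = [[1.0], [3.0]]
print(train_test_mismatch(train, train))  # 0.0
print(train_test_mismatch(train, test))   # 0.5
```

Under this kind of score, a smaller value for SRS features than for the baseline features at the same noise level would indicate a reduced train/test mismatch.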

Keywords

Cochlear implant; Kullback–Leibler divergence