Article ID: 565324
Journal: Speech Communication
Published Year: 2011
Pages: 11
File Type: PDF
Abstract

Cepstral normalisation in automatic speech recognition is investigated in the context of robustness to additive noise. In this paper, it is argued that such normalisation leads naturally to a speech feature based on signal-to-noise ratio (SNR) rather than absolute energy (or power). Explicit calculation of this SNR-cepstrum by means of a noise estimate is shown to have theoretical and practical advantages over the usual (energy-based) cepstrum. The relationship between the SNR-cepstrum and the articulation index, known in psycho-acoustics, is discussed. Experiments are presented suggesting that combining the SNR-cepstrum with the well-known perceptual linear prediction method can be beneficial in noisy environments.

► Cepstral normalisation is shown to be equivalent to using the SNR-spectrum and SNR-cepstrum.
► Calculation of the SNR-spectrum directly, rather than relying on CMN to do it, is beneficial.
► The SNR-cepstrum is closely related to the articulation index known in psycho-acoustics.
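
The equivalence claimed in the first highlight can be illustrated with a small numerical sketch. This is not the paper's implementation: it assumes the SNR-spectrum is simply the power spectrum divided by a fixed (stationary) noise power estimate, which may differ from the paper's exact definition, and it uses random numbers in place of real speech. The names `dct2`, `cepstrum`, `mean_normalise`, `noise` and `power` are all hypothetical. Under those assumptions, the SNR-cepstrum and the energy cepstrum differ only by a constant vector, so cepstral mean normalisation (CMN) makes them identical.

```python
import math
import random

def dct2(x):
    """Unnormalised DCT-II of a list of floats."""
    n = len(x)
    return [sum(x[j] * math.cos(math.pi * k * (j + 0.5) / n) for j in range(n))
            for k in range(n)]

def cepstrum(power_spectrum):
    """Cepstrum as the DCT of the log power spectrum."""
    return dct2([math.log(p) for p in power_spectrum])

def mean_normalise(frames):
    """Subtract the per-coefficient mean over all frames (CMN)."""
    means = [sum(col) / len(frames) for col in zip(*frames)]
    return [[c - m for c, m in zip(frame, means)] for frame in frames]

def max_abs_diff(a, b):
    """Largest absolute difference between two lists of frames."""
    return max(abs(x - y) for fa, fb in zip(a, b) for x, y in zip(fa, fb))

random.seed(0)
n_bins, n_frames = 8, 20

# Assumption: one noise power estimate per frequency bin, constant over time.
noise = [random.uniform(0.5, 2.0) for _ in range(n_bins)]
# Synthetic stand-in for the observed power spectrum of each frame.
power = [[random.uniform(1.0, 10.0) for _ in range(n_bins)]
         for _ in range(n_frames)]

energy_cep = [cepstrum(frame) for frame in power]
snr_cep = [cepstrum([p / n for p, n in zip(frame, noise)]) for frame in power]

# Before CMN the two features differ by the cepstrum of the noise estimate.
raw_diff = max_abs_diff(energy_cep, snr_cep)
# After CMN the constant offset is removed and the features coincide.
cmn_diff = max_abs_diff(mean_normalise(energy_cep), mean_normalise(snr_cep))

print("difference before CMN:", raw_diff)
print("difference after CMN:", cmn_diff)
```

The sketch covers only the idealised case of a noise estimate that does not change over time. The second highlight concerns the benefit of computing the SNR-spectrum directly rather than relying on CMN, which this toy example does not attempt to show.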

Related Topics
Physical Sciences and Engineering > Computer Science > Signal Processing