Article ID | Journal ID | Publication year | English article | Full-text version |
---|---|---|---|---|
565324 | 875726 | 2011 | 11-page PDF | Free download |
Cepstral normalisation in automatic speech recognition is investigated in the context of robustness to additive noise. In this paper, it is argued that such normalisation leads naturally to a speech feature based on signal-to-noise ratio (SNR) rather than absolute energy (or power). Explicit calculation of this SNR-cepstrum by means of a noise estimate is shown to have theoretical and practical advantages over the usual (energy-based) cepstrum. The relationship between the SNR-cepstrum and the articulation index, known in psycho-acoustics, is discussed. Experiments are presented suggesting that the combination of the SNR-cepstrum with the well-known perceptual linear prediction method can be beneficial in noisy environments.
► Cepstral normalisation is shown to be equivalent to using the SNR-spectrum and SNR-cepstrum.
► Calculating the SNR-spectrum directly, rather than relying on cepstral mean normalisation (CMN) to do it, is beneficial.
► The SNR-cepstrum is closely related to the articulation index known in psycho-acoustics.
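The difference between an energy-based cepstrum and an SNR-based one can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the four-band spectra, and the plain per-band ratio of observed power to the noise estimate are all assumptions made for illustration, and the paper's exact SNR estimator may differ. The sketch shows only that a feature built from a ratio is unchanged when signal and noise are scaled by the same gain, whereas the zeroth coefficient of the energy cepstrum shifts.

```python
import math


def dct_ii(values):
    """Unnormalised DCT-II, a transform commonly used to map log spectra to cepstra."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * (k + 0.5) * i / n) for k, v in enumerate(values))
        for i in range(n)
    ]


def energy_cepstrum(power, floor=1e-10):
    """Cepstrum of the log power spectrum (depends on absolute level)."""
    return dct_ii([math.log(max(p, floor)) for p in power])


def snr_cepstrum(power, noise, floor=1e-10):
    """Cepstrum of the log SNR spectrum.

    Assumption: the SNR in each band is taken as observed power divided by
    the noise estimate; this is a simplification chosen for the sketch.
    """
    return dct_ii(
        [math.log(max(p, floor) / max(n, floor)) for p, n in zip(power, noise)]
    )


# Hypothetical four-band power spectrum and noise estimate (illustrative values).
power = [4.0, 9.0, 2.5, 1.2]
noise = [1.0, 1.5, 0.5, 0.8]
gain = 10.0  # the same gain applied to both signal and noise

snr_base = snr_cepstrum(power, noise)
snr_scaled = snr_cepstrum([gain * p for p in power], [gain * n for n in noise])

energy_base = energy_cepstrum(power)
energy_scaled = energy_cepstrum([gain * p for p in power])

# The SNR-cepstrum is unchanged by the common gain (up to rounding error).
snr_shift = max(abs(a - b) for a, b in zip(snr_base, snr_scaled))

# The energy cepstrum's zeroth coefficient moves by len(power) * log(gain).
energy_c0_shift = energy_scaled[0] - energy_base[0]

print("max SNR-cepstrum change:", snr_shift)
print("energy c0 shift:", energy_c0_shift)
```

Only the zeroth coefficient of the energy cepstrum changes here, because the DCT of a constant offset is zero in every other coefficient. A gain that varies across bands would also alter the higher energy coefficients, while the ratio form would still cancel it whenever the noise estimate is affected by the same gain.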
Journal: Speech Communication - Volume 53, Issue 8, October 2011, Pages 991–1001