Article ID: 567492
Journal: Speech Communication
Published Year: 2012
Pages: 13
File Type: PDF
Abstract

In this paper we propose a novel objective method for predicting the intelligibility of enhanced speech. The method is based on the negative distortion ratio (NDR), that is, the amount of spectral power that has been removed relative to the original clean speech signal, likely because of a poor noise estimate during speech enhancement. Although negative spectral distortions can matter significantly in subjective intelligibility assessment of processed speech, most objective measures in the literature do not account well for this type of distortion. The proposed method focuses on a very specific type of noise, so it is not intended to be used alone but in combination with other techniques, to jointly achieve better intelligibility prediction. To find an appropriate technique to combine it with, we also review a number of recently proposed methods based on correlation and coherence measures. These methods have already shown high correlation with human recognition scores, as they effectively detect the nonlinearities frequently found in noise-suppressed speech. However, when these techniques are applied jointly with the proposed method, significantly higher correlations (above r = 0.9) are achieved.
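
The abstract does not give the NDR formula, so the following is only a minimal sketch of one plausible reading of "the amount of spectral power removed relative to clean speech", not the paper's definition. The function names, the Hann-windowed short-time analysis, the frame length, the hop size and the normalisation by total clean power are all assumptions made for illustration.

```python
import numpy as np


def power_spectrogram(x, frame_len=256, hop=128):
    """Short-time power spectrum (frames x frequency bins), Hann window.

    frame_len and hop are illustrative defaults, not values from the paper.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack(
        [x[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1)) ** 2


def negative_distortion_ratio(clean, enhanced, frame_len=256, hop=128, eps=1e-12):
    """Hypothetical NDR: fraction of clean spectral power missing after enhancement.

    Only time-frequency cells where the enhanced power falls below the clean
    power contribute; cells where power was added are ignored.
    """
    p_clean = power_spectrogram(clean, frame_len, hop)
    p_enh = power_spectrogram(enhanced, frame_len, hop)
    removed = np.maximum(p_clean - p_enh, 0.0)  # keep negative distortions only
    return float(removed.sum() / (p_clean.sum() + eps))
```

Under these assumptions, an enhanced signal identical to the clean one gives a ratio of 0, and attenuating the clean signal by a factor of 0.5 removes three quarters of the power in every cell, giving a ratio of about 0.75. Amplification alone leaves the ratio at 0, which reflects the focus on removed rather than added spectral content.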

► Negative spectral distortions as indication of loss of speech information.
► Speech information loss reduces intelligibility.
► Negative distortion ratio as distance based intelligibility measure.
► Combined distance and correlation based measures to improve intelligibility score.
► Objective intelligibility evaluation for enhanced speech.

Related Topics
Physical Sciences and Engineering > Computer Science > Signal Processing