Spectral and temporal manipulations of SFF envelopes for enhancement of speech intelligibility in noise

Article ID	Journal	Published Year	Pages	File Type
11028026	Computer Speech & Language	2019	18 Pages	PDF

Abstract

This paper presents a method for modifying speech to enhance its intelligibility in noise. The features contributing to intelligibility are analyzed using the recently proposed single frequency filtering (SFF) analysis of speech signals. In the SFF method, the spectral and temporal resolutions can be controlled using a single parameter of the filter, corresponding to the location of the pole on the negative real axis with respect to the unit circle in the z-plane. The SFF magnitude (envelope) and phase at several frequencies can be used to synthesize the original speech signal. Analysis of highly intelligible speech shows that the speech signal is more intelligible when it has a higher dynamic range of amplitude locally (fine structure) and/or a lower dynamic range of amplitude globally (gross structure) in both the spectral and temporal domains. Based on this observation, some features of normal speech are modified at the fine and gross levels in both the temporal and spectral domains, and the modified SFF envelopes are used to synthesize speech. The proposed method gives higher objective intelligibility scores than both the unmodified speech and the reference method (spectral shaping and dynamic range compression) under different noise conditions. In subjective evaluation, although the word accuracies of the proposed and reference methods are not significantly different, listeners seem to prefer the proposed method because it gives a louder and crisper sound.
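
As a rough illustration of the SFF analysis mentioned in the abstract, the following minimal sketch computes the envelope and phase at one frequency. It assumes the formulation commonly given in the SFF literature: the signal is frequency-shifted so that the frequency of interest lands at half the sampling frequency, then passed through a single-pole filter H(z) = 1/(1 + r z^-1), whose pole lies at z = -r on the negative real axis. The function name `sff_envelope_phase`, the value r = 0.99, and the test tone are assumptions made here for illustration, not details taken from the paper.

```python
import cmath
import math


def sff_envelope_phase(x, fs, freq_hz, r=0.99):
    """Envelope and phase of x at freq_hz (hypothetical helper, illustrative only).

    r sets the pole location z = -r. Values closer to 1 give a narrower
    band (finer spectral resolution) but a slower response in time.
    """
    omega_k = 2.0 * math.pi * freq_hz / fs
    omega_bar = math.pi - omega_k  # shift that moves freq_hz to fs/2
    y_prev = 0j
    envelope, phase = [], []
    for n, sample in enumerate(x):
        shifted = sample * cmath.exp(1j * omega_bar * n)
        # Single-pole filter: y[n] = -r * y[n-1] + shifted[n]
        y = shifted - r * y_prev
        envelope.append(abs(y))
        phase.append(cmath.phase(y))
        y_prev = y
    return envelope, phase


# Illustrative input: a 1 kHz tone sampled at 8 kHz (assumed values).
fs = 8000
tone = [math.cos(2.0 * math.pi * 1000 * n / fs) for n in range(2000)]
env_on, _ = sff_envelope_phase(tone, fs, 1000)   # channel at the tone frequency
env_off, _ = sff_envelope_phase(tone, fs, 3000)  # channel away from the tone
```

In this sketch the envelope at 1000 Hz settles to a much larger value than the envelope at 3000 Hz. Repeating the computation over a set of frequencies would give the envelopes as a function of time and frequency, which is the kind of representation the method modifies.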

Keywords

Speech intelligibility