Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
11028026 | Computer Speech & Language | 2019 | 18 Pages |
Abstract
This paper presents a method for modifying speech to enhance its intelligibility in noise. The features contributing to intelligibility are analyzed using the recently proposed single frequency filtering (SFF) analysis of speech signals. In the SFF method, the spectral and temporal resolutions can be controlled using a single filter parameter: the location of the pole on the negative real axis relative to the unit circle in the z-plane. The SFF magnitude (envelope) and phase at several frequencies can be used to synthesize the original speech signal. Analysis of highly intelligible speech shows that speech is more intelligible when its amplitude has a higher dynamic range locally (fine structure) and/or a lower dynamic range globally (gross structure), in both the spectral and temporal domains. Accordingly, some features of normal speech are modified at the fine and gross temporal and spectral levels, and the modified SFF envelopes are used to synthesize speech. Under different noise conditions, the proposed method gives higher objective intelligibility scores than both the original speech and the reference method (spectral shaping and dynamic range compression). In subjective evaluation, although the word accuracies of the proposed and reference methods are not significantly different, listeners seem to prefer the proposed method because it gives a louder and crisper sound.
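The SFF analysis mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the commonly described SFF formulation: the signal is frequency-shifted so that the frequency of interest lands at half the sampling frequency, then passed through a single-pole filter with its pole at z = -r on the negative real axis. The function name `sff_envelope_phase`, the default `r = 0.99`, and the omission of any pre-processing (such as differencing the signal) are assumptions made for illustration only.

```python
import numpy as np
from scipy.signal import lfilter


def sff_envelope_phase(x, fs, freqs_hz, r=0.99):
    """Hypothetical helper: SFF envelope and phase at each frequency in freqs_hz.

    x        : 1-D real signal
    fs       : sampling frequency in Hz
    freqs_hz : frequencies (Hz) at which to compute the SFF output
    r        : pole radius (pole at z = -r); closer to 1 gives finer
               spectral resolution and coarser temporal resolution
    """
    x = np.asarray(x, dtype=float)
    n = np.arange(len(x))
    envelopes = np.empty((len(freqs_hz), len(x)))
    phases = np.empty_like(envelopes)
    for i, f in enumerate(freqs_hz):
        # Shift the component at frequency f to pi rad/sample (fs / 2).
        w_shift = np.pi - 2.0 * np.pi * f / fs
        shifted = x * np.exp(1j * w_shift * n)
        # Single-pole filter H(z) = 1 / (1 + r z^-1), i.e. y[n] = -r*y[n-1] + shifted[n].
        y = lfilter([1.0], [1.0, r], shifted)
        envelopes[i] = np.abs(y)   # SFF magnitude (envelope)
        phases[i] = np.angle(y)    # SFF phase
    return envelopes, phases


# Usage: a 1 kHz tone sampled at 8 kHz, analyzed at 1 kHz and 3 kHz.
fs = 8000
t = np.arange(4000) / fs
tone = np.cos(2.0 * np.pi * 1000.0 * t)
env, ph = sff_envelope_phase(tone, fs, [1000.0, 3000.0])
```

In this sketch the envelope at 1 kHz settles to a much larger value than the envelope at 3 kHz, because only the shifted 1 kHz component falls near the pole. Moving `r` closer to 1 sharpens the frequency selectivity at the cost of a slower temporal response, which is the single-parameter trade-off the abstract describes.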
Related Topics
Physical Sciences and Engineering
Computer Science
Signal Processing
Authors
Nivedita Chennupati, Sudarsana Reddy Kadiri, Yegnanarayana B.