Model-based estimation of late reverberant spectral variance using modified weighted prediction error method

Article ID	Journal	Published Year	Pages	File Type
4977783	Speech Communication	2017	19 Pages	PDF

Abstract

In this paper, we propose a new approach to estimate the late reverberant spectral variance (LRSV) for speech dereverberation in the short-time Fourier transform (STFT) domain. Our approach uses a model-based scheme involving the estimation of a smoothing (shape) parameter and the reverberant-only component of speech. We propose to obtain the shape parameter by using estimates of the spectral variances of the direct-path and reverberant-only components of the speech, which, in turn, can be calculated by smoothing coarse estimates of these two components. Furthermore, an accurate estimate of the reverberant-only component is obtained by means of a moving average scheme. In order to obtain the preliminary estimates of the direct-path and reverberant speech components, we employ a modified version of the weighted prediction error (WPE) method. In contrast to the original WPE method, the suggested modification is implemented for shorter processing blocks, each consisting of a number of STFT frames. This block-wise procedure allows for adaptation to moderate changes in the environment and makes the proposed approach also suitable for time-varying acoustic scenarios. Performance evaluations against previous LRSV estimation methods demonstrate the superiority of the proposed approach in both time-invariant and time-variant reverberant environments.
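
The three generic building blocks named in the abstract (smoothing a coarse STFT-domain estimate into a spectral variance, a moving average over frames, and block-wise processing of STFT frames) can be illustrated with a minimal NumPy sketch. This is not the authors' algorithm: the function names, the first-order recursive smoother, the smoothing constant `alpha`, the causal averaging window `length`, and the non-overlapping block layout are all assumptions made for illustration, and the WPE step itself is not shown.

```python
import numpy as np


def smooth_spectral_variance(coarse, alpha=0.9):
    """Recursively smooth |coarse|^2 along the frame axis.

    coarse: complex STFT-domain estimate, shape (bins, frames).
    alpha:  assumed smoothing constant in [0, 1); not taken from the paper.
    """
    power = np.abs(coarse) ** 2
    var = np.empty_like(power)
    var[:, 0] = power[:, 0]  # initialise with the first frame
    for n in range(1, power.shape[1]):
        var[:, n] = alpha * var[:, n - 1] + (1.0 - alpha) * power[:, n]
    return var


def moving_average(x, length=5):
    """Causal moving average along the frame axis (hypothetical window length)."""
    out = np.empty_like(x, dtype=float)
    for n in range(x.shape[1]):
        start = max(0, n - length + 1)  # shorter window at the beginning
        out[:, n] = x[:, start:n + 1].mean(axis=1)
    return out


def split_into_blocks(num_frames, block_len):
    """Return (start, stop) frame indices of non-overlapping processing blocks."""
    return [(s, min(s + block_len, num_frames))
            for s in range(0, num_frames, block_len)]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Stand-in for a coarse reverberant-component estimate: 4 bins, 20 frames.
    coarse = rng.standard_normal((4, 20)) + 1j * rng.standard_normal((4, 20))
    for start, stop in split_into_blocks(coarse.shape[1], block_len=8):
        block_var = smooth_spectral_variance(coarse[:, start:stop])
        block_avg = moving_average(block_var, length=3)
        print(start, stop, block_avg.shape)
```

Processing each block independently, as in the loop above, is one simple way to let the statistics follow slow changes in the acoustic environment; whether blocks overlap or share state across boundaries is a design choice the abstract does not specify.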

Keywords

CD; MMSE; Room acoustics; Expectation-Maximization; Short-time Fourier transform (STFT); Blind channel identification; Moving average; Signal-to-noise ratio; Room impulse response