Speech enhancement based on β-order MMSE estimation of Short Time Spectral Amplitude and Laplacian speech modeling

Article ID	Journal	Published Year	Pages	File Type
6961146	Speech Communication	2015	10 Pages	PDF

Abstract

This paper addresses the problem of speech enhancement using Minimum Mean-Square Error (MMSE) estimation of the β-order Short-Time Spectral Amplitude (STSA). The motivation is to take advantage of both Laplacian speech modeling and the β-order cost function in MMSE estimation of clean speech. We present an analytical solution for the β-order MMSE STSA estimator, assuming a Laplacian prior for the real and imaginary parts of the Discrete Fourier Transform (DFT) coefficients of clean speech and a Gaussian distribution for the real and imaginary parts of the DFT coefficients of the noise. The analytical solution, named β-order LapMMSE, does not have a closed form and is highly non-linear and computationally complex. Using approximations for the joint probability density function and the Bessel function, we also present an improved closed-form version of the estimator, called β-order ImpLapMMSE. The value of β is adapted as a function of the frame Signal-to-Noise Ratio (SNR). We have compared the proposed estimator with state-of-the-art estimators that assume either Gaussian or Laplacian probability density functions for the real and imaginary parts of the DFT coefficients of clean speech. To this end, the noisy input signal and the outputs of the MMSE STSA, β-order STSA, and ImpLapMMSE estimators have been compared with the output of the proposed estimator. Our comparative evaluations in terms of Segmental SNR (SegSNR), Perceptual Evaluation of Speech Quality (PESQ), and Log-Likelihood Ratio (LLR) distance demonstrate the superior performance of the proposed β-order ImpLapMMSE estimator.
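
As an illustration of one of the evaluation measures named above, the following is a minimal sketch of a Segmental SNR computation. It is not taken from the paper: the function name segmental_snr, the non-overlapping 256-sample frames, and the per-frame clamp to the range [-10, 35] dB are assumptions that follow a commonly used convention, and the authors' exact settings may differ.

```python
import numpy as np

def segmental_snr(clean, enhanced, frame_len=256, lo_db=-10.0, hi_db=35.0, eps=1e-10):
    """Mean per-frame SNR in dB between a clean reference and an enhanced signal.

    Assumptions (not from the paper): non-overlapping frames of frame_len
    samples, and per-frame values clamped to [lo_db, hi_db] before averaging.
    """
    clean = np.asarray(clean, dtype=float)
    enhanced = np.asarray(enhanced, dtype=float)

    # Use only whole frames that both signals can supply.
    n_frames = min(len(clean), len(enhanced)) // frame_len
    if n_frames == 0:
        raise ValueError("signals are shorter than one frame")
    n = n_frames * frame_len
    s = clean[:n].reshape(n_frames, frame_len)
    e = enhanced[:n].reshape(n_frames, frame_len)

    # Per-frame energy of the reference and of the residual error.
    signal_energy = np.sum(s ** 2, axis=1)
    error_energy = np.sum((s - e) ** 2, axis=1)

    # eps guards against division by zero and log of zero.
    snr_db = 10.0 * np.log10((signal_energy + eps) / (error_energy + eps))

    # Clamp each frame so silent or perfectly matched frames do not dominate.
    return float(np.mean(np.clip(snr_db, lo_db, hi_db)))


if __name__ == "__main__":
    t = np.arange(1024)
    x = np.sin(2 * np.pi * 0.01 * t)
    print(segmental_snr(x, 0.9 * x))
```

Clamping per frame is a design choice that keeps frames with almost no speech energy, or frames that are reconstructed almost perfectly, from skewing the average.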

Keywords

speech enhancement