Article ID | Journal | Published Year | Pages |
---|---|---|---|
6961146 | Speech Communication | 2015 | 10 Pages |
Abstract
This paper addresses the problem of speech enhancement using the Minimum Mean-Square Error (MMSE) estimator of the β-order Short-Time Spectral Amplitude (STSA). The motivation is to take advantage of both Laplacian speech modeling and the β-order cost function in MMSE estimation of clean speech. We present an analytical solution for the β-order MMSE STSA estimator, assuming a Laplacian prior for the real and imaginary parts of the Discrete Fourier Transform (DFT) coefficients of (clean) speech. We also assume a Gaussian distribution for the real and imaginary parts of the DFT coefficients of the noise. The analytical solution, named β-order LapMMSE, does not have a closed form and is highly non-linear and computationally complex. Using approximations for the joint probability density function and the Bessel function, we also present an improved closed-form version of the estimator (called β-order ImpLapMMSE). The value of β is adapted as a function of the frame Signal-to-Noise Ratio (SNR). We have compared the performance of the proposed estimator with state-of-the-art estimators that assume either Gaussian or Laplacian probability density functions for the real and imaginary parts of the DFT coefficients of clean speech. To this end, the noisy input signal and the outputs of the MMSE STSA, β-order STSA, and ImpLapMMSE estimators have been compared with the output of the proposed estimator. Our comparative evaluations in terms of Segmental SNR (SegSNR), Perceptual Evaluation of Speech Quality (PESQ), and Log-Likelihood Ratio (LLR) distance demonstrate the superior performance of the proposed β-order ImpLapMMSE estimator.
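For orientation, the Gaussian-prior β-order STSA estimator that the abstract lists as a baseline can be sketched as a spectral gain function. This is not the proposed Laplacian estimator, whose expressions are not given in the abstract. The sketch assumes the standard form of the Gaussian-prior gain, G = (√v / γ) · [Γ(β/2 + 1) · M(−β/2; 1; −v)]^(1/β) with v = ξγ / (1 + ξ), where ξ is the a priori SNR, γ is the a posteriori SNR, and M is Kummer's confluent hypergeometric function. The function names `hyp1f1_series` and `beta_order_gain` are hypothetical, and the series evaluation is only meant for moderate values of v.

```python
import math


def hyp1f1_series(a, b, x, tol=1e-12, max_terms=2000):
    """Kummer's function M(a; b; x) summed from its power series (used here for x >= 0)."""
    term = 1.0
    total = 1.0
    for n in range(max_terms):
        # Ratio of consecutive series terms: (a + n) / (b + n) * x / (n + 1)
        term *= (a + n) / (b + n) * x / (n + 1)
        total += term
        if abs(term) <= tol * abs(total):
            break
    return total


def beta_order_gain(xi, gamma, beta):
    """Gaussian-prior beta-order MMSE STSA gain (illustrative sketch, beta > 0).

    xi    : a priori SNR (linear scale)
    gamma : a posteriori SNR (linear scale)
    beta  : order of the cost function
    """
    v = xi / (1.0 + xi) * gamma
    # Kummer transformation M(-beta/2; 1; -v) = exp(-v) * M(1 + beta/2; 1; v)
    # keeps every series term positive, avoiding cancellation.
    # Assumption: v stays moderate (below a few hundred) so exp(-v) does not underflow.
    m = math.exp(-v) * hyp1f1_series(1.0 + beta / 2.0, 1.0, v)
    return math.sqrt(v) / gamma * (math.gamma(beta / 2.0 + 1.0) * m) ** (1.0 / beta)


if __name__ == "__main__":
    # beta = 1 corresponds to the classical MMSE STSA gain.
    for beta in (0.5, 1.0, 2.0):
        print(beta, beta_order_gain(xi=1.0, gamma=2.0, beta=beta))
```

The enhanced amplitude in each frequency bin would then be the gain multiplied by the noisy spectral amplitude. An adaptive scheme of the kind described in the abstract would pass a frame-dependent `beta`; the mapping from frame SNR to β is not specified there and is left out of this sketch.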
Related Topics
Physical Sciences and Engineering
Computer Science
Signal Processing
Authors
Hamid Reza Abutalebi, Mehdi Rashidinejad