Compressive speech enhancement

Article ID	Journal	Published Year	Pages	File Type
567168	Speech Communication	2013	12 Pages	PDF

Abstract

• Proposes a new method that exploits the sparsity assumption in compressed sensing to perform speech enhancement.
• Compressive speech enhancement recovers only the sparse components, which in this case are the speech.
• Theoretical derivations show an improvement in signal-to-noise ratio.
• No voice activity detector or noise statistics estimator is needed.

This paper presents an alternative approach to speech enhancement using compressed sensing (CS). CS is a new sampling theory, which states that sparse signals can be reconstructed from far fewer measurements than Nyquist sampling requires. As such, CS can be exploited to reconstruct only the sparse components (e.g., speech) from a mixture of sparse and non-sparse components (e.g., noise). This is possible because, in a time-frequency representation, the speech signal is sparse whilst most noise is non-sparse. The derivation shows that, on average, the signal-to-noise ratio (SNR) in the compressed domain is greater than or equal to that in the uncompressed domain. Experimental results concur with the derivation, and the proposed CS scheme achieves better or similar perceptual evaluation of speech quality (PESQ) scores and segmental SNR compared with conventional methods over a wide range of input SNRs.
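The core idea, that a sparse component can be recovered from compressed measurements while dense noise is largely left behind, can be illustrated with a toy example. The sketch below is not the paper's algorithm: it assumes a synthetic sparse vector standing in for a time-frequency speech frame, white Gaussian noise as the non-sparse component, a random Gaussian measurement matrix, and orthogonal matching pursuit (OMP) as the recovery method. All names and parameter values (`omp`, `Phi`, `n`, `m`, `k`, the noise level) are illustrative assumptions.

```python
import numpy as np

def omp(Phi, y, k):
    """Orthogonal matching pursuit: greedily pick k columns of Phi to explain y."""
    residual = y.copy()
    support = []
    coef = np.zeros(0)
    for _ in range(k):
        # Pick the column most correlated with the current residual.
        corr = np.abs(Phi.T @ residual)
        if support:
            corr[support] = 0.0  # never pick the same column twice
        support.append(int(np.argmax(corr)))
        # Least-squares fit on the selected columns, then update the residual.
        coef, *_ = np.linalg.lstsq(Phi[:, support], y, rcond=None)
        residual = y - Phi[:, support] @ coef
    x_hat = np.zeros(Phi.shape[1])
    x_hat[support] = coef
    return x_hat

def snr_db(clean, estimate):
    """SNR of an estimate relative to the clean signal, in dB."""
    return 10 * np.log10(np.sum(clean**2) / np.sum((clean - estimate) ** 2))

rng = np.random.default_rng(0)
n, m, k = 256, 128, 4  # signal length, number of measurements, sparsity (assumed)

# Sparse "speech" component: k large coefficients at random positions.
s = np.zeros(n)
idx = rng.choice(n, size=k, replace=False)
s[idx] = rng.choice([-1.0, 1.0], size=k) * rng.uniform(2.0, 3.0, size=k)

# Non-sparse "noise" component: small energy spread over every coefficient.
w = 0.05 * rng.standard_normal(n)
x = s + w

# Random measurement matrix with unit-norm columns; m < n compressed measurements.
Phi = rng.standard_normal((m, n))
Phi /= np.linalg.norm(Phi, axis=0)
y = Phi @ x

# Recover only a k-sparse estimate from the compressed measurements.
s_hat = omp(Phi, y, k)

snr_in = snr_db(s, x)
snr_out = snr_db(s, s_hat)
print(f"input SNR:  {snr_in:.1f} dB")
print(f"output SNR: {snr_out:.1f} dB")
```

Because the recovery is constrained to `k` coefficients, most of the dense noise cannot be represented in the estimate, which is the intuition behind the enhancement effect described above. In this toy setting the sparsity level is known in advance; a practical system would have to choose or estimate it.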

Keywords

Sparsity; Speech enhancement; Compressed sensing