Article ID: 567095
Journal: Speech Communication
Published Year: 2013
Pages: 19
File Type: PDF
Abstract

• We propose a method for modulation domain noise robust blind speech separation.
• We use modulation domain complex spectral subtraction to preprocess speech mixtures.
• We use asymmetric Laplacian mixture modeling to estimate source DOAs.
• We use modulation domain separation to increase the source sparsity.
• Our method showed superior performance in PESQ, segmental SDR, and SIR gain.

We propose a noise-robust blind speech separation (BSS) method that uses two microphones. We perform BSS in the modulation domain to take advantage of its improved signal sparsity and reduced musical tone noise compared with conventional acoustic frequency domain processing. We first use modulation domain real and imaginary spectral subtraction (MRISS) to enhance both the magnitude and phase spectra of the noisy speech mixture inputs. We then estimate the directions of arrival (DOAs) of the speech sources from subband inter-sensor phase differences (IPDs) using an asymmetric Laplacian mixture model (ALMM), cluster the full-band IPDs via the estimated DOAs, and perform time–frequency masking to separate the source signals, all in the modulation domain. Experimental evaluations with five types of noise show that the proposed method performs robustly at SNRs of 0–10 dB and is superior to acoustic domain separation without MRISS.
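
The link between IPDs, DOAs, and time–frequency masking described above can be illustrated with a minimal sketch. This is not the method of the paper: it assumes a far-field, anechoic two-microphone model in which the IPD equals 2πfd·sin(θ)/c, it replaces the ALMM clustering with a simple nearest-DOA assignment, and it does not implement MRISS or any modulation domain transform. The function names (`ipd`, `ipd_to_doa`, `binary_masks`), the sign convention, and the speed-of-sound constant are all hypothetical choices made for illustration.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value, not taken from the paper


def ipd(x1: complex, x2: complex) -> float:
    """Phase of microphone 2 relative to microphone 1, in (-pi, pi]."""
    return cmath.phase(x2 * x1.conjugate())


def ipd_to_doa(phase_diff: float, freq_hz: float, mic_distance_m: float) -> float:
    """Map an IPD to a DOA in degrees under the assumed far-field model.

    Assumed model: phase_diff = 2*pi*f*d*sin(theta)/c.
    Only meaningful below the spatial aliasing frequency c / (2*d).
    """
    s = phase_diff * SPEED_OF_SOUND / (2.0 * math.pi * freq_hz * mic_distance_m)
    s = max(-1.0, min(1.0, s))  # clamp so noisy IPDs stay in asin's domain
    return math.degrees(math.asin(s))


def binary_masks(bin_doas_deg, source_doas_deg):
    """Assign each time-frequency bin to the source with the nearest DOA.

    A stand-in for mixture-model clustering: returns one 0/1 mask per source.
    """
    masks = [[0] * len(bin_doas_deg) for _ in source_doas_deg]
    for k, theta in enumerate(bin_doas_deg):
        nearest = min(
            range(len(source_doas_deg)),
            key=lambda j: abs(theta - source_doas_deg[j]),
        )
        masks[nearest][k] = 1
    return masks


# Usage: simulate one bin at 1 kHz for a source at 30 degrees, mics 4 cm apart.
true_phase = 2.0 * math.pi * 1000.0 * 0.04 * math.sin(math.radians(30.0)) / SPEED_OF_SOUND
x1 = 1.0 + 0.0j
x2 = x1 * cmath.exp(1j * true_phase)
estimated_doa = ipd_to_doa(ipd(x1, x2), 1000.0, 0.04)
masks = binary_masks([28.0, -41.0, 33.0], [30.0, -40.0])
```

Multiplying each mask element-wise with a mixture spectrogram would yield one separated source estimate per mask. In a real system the per-bin DOA estimates are noisy, which is why a statistical clustering model is used in place of a hard nearest-neighbor rule.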

Related Topics
Physical Sciences and Engineering > Computer Science > Signal Processing