A permutation algorithm based on dynamic time warping in speech frequency-domain blind source separation

Article ID	Journal	Published Year	Pages	File Type
4977786	Speech Communication	2017	10 Pages	PDF

Abstract

Frequency-Domain Blind Source Separation (FD-BSS) is an efficient way to separate convolutively mixed speech. To improve the quality of the separated speech, this paper proposes a permutation algorithm based on Dynamic Time Warping (DTW). Because signals in adjacent frequency bins are highly similar, DTW is used to compare them and to generate adjustment matrices that resolve the permutation ambiguity. The approach is evaluated in both simulated and practical experiments, using the Signal to Distortion Ratio (SDR), Signal to Interference Ratio (SIR), Signal to Artifacts Ratio (SAR), and Perceptual Evaluation of Speech Quality (PESQ) as measures. To examine the quality of the separated speech in a practical acoustic environment, we also report the recognition accuracy of Automatic Speech Recognition (ASR). In the experiments, we compare our approach with other classical permutation criteria: Kullback-Leibler (K-L) divergence, envelope correlation, and higher-order statistics. The experimental results show that the proposed algorithm performs permutation alignment more accurately and improves the acoustic quality of the separated speech.
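
The idea of comparing adjacent frequency bins with DTW can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the features, the local cost, or how the adjustment matrices are built. The sketch assumes each bin holds one amplitude envelope per separated source, uses an absolute-difference local cost, and searches all source orderings exhaustively. The names `dtw_distance`, `best_permutation`, and `permutation_matrix` are hypothetical.

```python
from itertools import permutations


def dtw_distance(a, b):
    """Classic DTW cost between two 1-D sequences (absolute-difference local cost)."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # step in a only
                                 cost[i][j - 1],      # step in b only
                                 cost[i - 1][j - 1])  # step in both
    return cost[n][m]


def best_permutation(ref_bin, cur_bin):
    """Return perm such that cur_bin[perm[k]] best matches ref_bin[k].

    ref_bin and cur_bin are lists of envelopes, one per separated source
    (an assumed representation). All orderings are tried, which is only
    practical for a small number of sources.
    """
    n = len(ref_bin)
    return min(
        permutations(range(n)),
        key=lambda p: sum(dtw_distance(ref_bin[k], cur_bin[p[k]]) for k in range(n)),
    )


def permutation_matrix(perm):
    """Build a 0/1 matrix P with P[r][perm[r]] = 1 (an assumed form of adjustment matrix)."""
    n = len(perm)
    return [[1 if perm[r] == c else 0 for c in range(n)] for r in range(n)]
```

As a usage example, take a reference bin with envelopes `[[0, 1, 3, 1, 0, 0], [2, 2, 0, 0, 2, 2]]` and an adjacent bin with `[[2, 2, 2, 0, 0, 2], [0, 0, 1, 3, 1, 0]]`, where the sources are swapped and slightly shifted in time. `best_permutation` selects the ordering `(1, 0)`, and `permutation_matrix((1, 0))` gives the matrix that swaps the two outputs. The warping step is what lets the match tolerate the small time shift between bins.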

Keywords

ICA, Independent component analysis; Permutation; Dynamic Time Warping (DTW); Blind Source Separation (BSS)