Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
484876 | Procedia Computer Science | 2015 | 5 Pages |
Abstract
Speech is subject to various acoustic interferences in natural environments, and many applications require an effective way to separate the dominant signal from the interference. In this paper, an unsupervised method for single-channel speech separation based on the Short-time Fourier Transform (STFT) is proposed. It uses the pitch information of the dominant and interfering speakers to generate a time-frequency mask based on their pitch frequencies. Through rigorous objective and subjective evaluations, it is shown that the proposed system provides better Signal to Noise Ratio (SNR) and Perceptual Evaluation of Speech Quality (PESQ) scores than other related methods available in the literature.
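The abstract does not state how the mask is derived from the two pitch tracks, so the following is only a minimal sketch of one plausible rule, not the paper's method: a frequency bin is assigned to the dominant speaker when it lies at least as close to a harmonic of the dominant pitch as to a harmonic of the interfering pitch. The function names `harmonic_distance` and `pitch_mask`, the binary (rather than soft) mask, and the handling of unvoiced frames are all assumptions made for illustration.

```python
def harmonic_distance(freq_hz, f0_hz):
    """Distance in Hz from freq_hz to the nearest harmonic (k >= 1) of f0_hz."""
    k = max(1, round(freq_hz / f0_hz))
    return abs(freq_hz - k * f0_hz)


def pitch_mask(f0_dominant, f0_interferer, n_fft, sample_rate):
    """Build a binary time-frequency mask from two per-frame pitch tracks.

    f0_dominant, f0_interferer: pitch in Hz for each STFT frame; 0 or None
    marks a frame where that speaker is unvoiced (assumed convention).
    Returns one list per frame with n_fft // 2 + 1 values, where 1 keeps
    the bin for the dominant speaker and 0 discards it.
    """
    n_bins = n_fft // 2 + 1
    bin_hz = sample_rate / n_fft  # frequency spacing between STFT bins
    mask = []
    for f_dom, f_int in zip(f0_dominant, f0_interferer):
        if not f_dom:
            # Assumption: with no dominant pitch, nothing is kept.
            mask.append([0] * n_bins)
        elif not f_int:
            # Assumption: with no interfering pitch, everything is kept.
            mask.append([1] * n_bins)
        else:
            row = []
            for b in range(n_bins):
                freq = b * bin_hz
                d_dom = harmonic_distance(freq, f_dom)
                d_int = harmonic_distance(freq, f_int)
                row.append(1 if d_dom <= d_int else 0)
            mask.append(row)
    return mask


# Example: 8 kHz audio, 512-point STFT, three frames.
mask = pitch_mask([125.0, 0, 125.0], [200.0, 200.0, 0], n_fft=512, sample_rate=8000)
```

In a full pipeline, such a mask would be multiplied element-wise with the STFT of the mixture and the result passed through an inverse STFT to resynthesize the dominant speaker; pitch estimation itself is outside the scope of this sketch.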
Related Topics
Physical Sciences and Engineering
Computer Science
Computer Science (General)