Time difference of arrival estimation of speech source in a noisy and reverberant environment

Article ID	Journal	Published Year	Pages	File Type
10370001	Signal Processing	2005	28 Pages	PDF

Abstract

Determining the spatial position of a speaker is of growing interest in video conferencing scenarios that require automated camera steering and tracking. Speaker localization can be achieved with a two-stage approach. In the first stage, a microphone array is used to extract the time difference of arrival (TDOA) of the speech signal. These estimates are then used by the second stage for the actual localization. In this work we present novel frequency-domain approaches for TDOA estimation in a reverberant and noisy environment. Our methods are based on the quasi-stationarity of speech, the stationarity of the noise, and the fact that the speech and the noise are uncorrelated. The mathematical derivations are followed by an extensive experimental study involving both static and tracking scenarios.
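
To make the first stage concrete, the following is a minimal sketch of TDOA estimation for a single microphone pair. It is not the frequency-domain method proposed in this work: it is the classical time-domain baseline that picks the lag maximizing the cross-correlation between the two channels, and it ignores reverberation. The function names `estimate_tdoa` and `simulate_pair`, the white-noise stand-in for speech, and all parameter values are assumptions made for illustration only.

```python
import math
import random


def estimate_tdoa(x1, x2, fs, max_lag):
    """Estimate the delay of x2 relative to x1, in seconds.

    Baseline only: returns the lag (within +/- max_lag samples) that
    maximizes the raw cross-correlation, divided by the sampling rate fs.
    """
    best_lag, best_score = 0, -math.inf
    for lag in range(-max_lag, max_lag + 1):
        # Correlate x1[t] with x2[t + lag] over the overlapping samples.
        start = max(0, -lag)
        stop = min(len(x1), len(x2) - lag)
        score = sum(x1[t] * x2[t + lag] for t in range(start, stop))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag / fs


def simulate_pair(n, delay, noise_std, seed=0):
    """Hypothetical test signals: mic 2 hears the source `delay` samples
    (delay >= 0) after mic 1, and each mic adds independent noise that is
    uncorrelated with the source."""
    rng = random.Random(seed)
    source = [rng.gauss(0.0, 1.0) for _ in range(n + delay)]
    mic1 = [source[t + delay] + rng.gauss(0.0, noise_std) for t in range(n)]
    mic2 = [source[t] + rng.gauss(0.0, noise_std) for t in range(n)]
    return mic1, mic2


if __name__ == "__main__":
    fs = 8000  # assumed sampling rate in Hz
    mic1, mic2 = simulate_pair(n=2000, delay=5, noise_std=0.5)
    tdoa = estimate_tdoa(mic1, mic2, fs=fs, max_lag=20)
    print(f"estimated TDOA: {tdoa * 1e3:.3f} ms")
```

The sketch relies on the same assumption stated above, that the source and the noise are uncorrelated, so the noise terms average out in the cross-correlation sum. In a reverberant room the raw correlation peak becomes unreliable, which is the setting the methods in this work are designed for.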

Keywords

TDOA decorrelation; Non-stationarity; Source localization