Article ID: 10321083
Journal: Cognitive Systems Research
Published Year: 2005
Pages: 24
File Type: PDF
Abstract
Synchrony detection between different sensory channels appears critically important for learning and cognitive development. In this paper we compare infant studies of audio-visual synchrony detection with a model of synchrony detection based on Gaussian mutual information [Hershey, J., & Movellan, J. (2000). Audio-vision: Using audio-visual synchrony to locate sounds. In S. A. Solla, T. K. Leen, & K. R. Müller (Eds.), Advances in neural information processing systems (Vol. 12, pp. 813-819). Cambridge, MA: MIT Press], augmented with methods for quantitative synchrony estimation. Five infant-model comparisons are presented, using stimuli covering a broad range of audio-visual integration types. While both infants and the model discriminated each stimulus type, the model was most successful with stimuli comprising (a) synchronized punctate motion and speech, (b) visually balanced left and right instances of the same person talking, with speech synchronized to only one side, and (c) two speech audio sources and a dynamic-face motion source. More difficult for the model were conditions with (d) left and right instances of two different people talking, with speech synchronized to only one side, and (e) two speech audio sources and more abstract visual dynamics (an oscilloscope instead of a face). As a first approximation, this model of synchrony detection using low-level sensory features (e.g., RMS audio, grayscale pixels) is a candidate mechanism for infant detection of audio-visual synchrony.
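The Hershey-Movellan measure the abstract builds on treats each pixel's grayscale intensity and an audio feature as jointly Gaussian over a temporal window, in which case their mutual information reduces to -1/2 log(1 - rho^2), where rho is the correlation between the two time series. The sketch below is a minimal illustration of that idea, not the paper's code: the function name, the use of whole-clip statistics rather than a sliding window, and the numerical safeguards are assumptions introduced here.

```python
import numpy as np

def gaussian_mutual_information(audio_rms, frames):
    """Estimate Gaussian MI between an audio feature and each pixel.

    audio_rms: (T,) RMS audio energy, one value per video frame.
    frames:    (T, H, W) grayscale frames over the same window.

    For jointly Gaussian scalars, I(A; V) = -1/2 * log(1 - rho^2),
    where rho is the Pearson correlation over the window.
    """
    T, H, W = frames.shape
    a = audio_rms - audio_rms.mean()
    v = frames.reshape(T, -1).astype(float)
    v = v - v.mean(axis=0)
    # Covariance and correlation between the audio and every pixel
    # time series (population statistics over the T frames).
    cov = a @ v / T
    rho = cov / (a.std() * v.std(axis=0) + 1e-12)
    rho = np.clip(rho, -0.999999, 0.999999)  # keep log finite
    mi = -0.5 * np.log(1.0 - rho ** 2)
    return mi.reshape(H, W)  # per-pixel synchrony map
```

Pixels whose intensity covaries with the audio (e.g., a talking mouth) receive high MI, which is how Hershey and Movellan use the measure to locate sound sources; summing or averaging such a map over a stimulus region gives the kind of quantitative synchrony estimate the comparisons above rely on.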
Related Topics
Physical Sciences and Engineering > Computer Science > Artificial Intelligence