Article ID: 925415
Journal: Brain and Language
Published Year: 2012
Pages: 11 Pages
File Type: PDF
Abstract

The human capacity for processing speech is remarkable, especially given that information in speech unfolds over multiple time scales concurrently. Similarly notable is our ability to filter out extraneous sounds and focus our attention on one conversation, epitomized by the ‘Cocktail Party’ effect. Yet, the neural mechanisms underlying on-line speech decoding and attentional stream selection are not well understood. We review findings from behavioral and neurophysiological investigations that underscore the importance of the temporal structure of speech for achieving these perceptual feats. We discuss the hypothesis that entrainment of ambient neuronal oscillations to speech’s temporal structure, across multiple time scales, serves to facilitate its decoding and underlies the selection of an attended speech stream over other competing input. In this regard, speech decoding and attentional stream selection are examples of ‘Active Sensing’, emphasizing an interaction between proactive and predictive top-down modulation of neuronal dynamics and bottom-up sensory input.

► The temporal structure of speech is critical for speech intelligibility and stream segregation.
► Neuronal oscillations entrain to rhythms in speech, across multiple time scales.
► Entrainment may serve to enhance the representation of an attended speaker at a ‘Cocktail Party’.
► Speech decoding and attentional stream selection are prime examples of ‘Active Sensing’.

Related Topics
Life Sciences Neuroscience Biological Psychiatry