Building a state space for song learning

Article ID	Journal	Published Year	Pages	File Type
8840126	Current Opinion in Neurobiology	2018	10 Pages	PDF

Abstract

The songbird system has shed light on how the brain produces precisely timed behavioral sequences and how it implements reinforcement learning (RL). RL is a powerful strategy for learning what action to produce in each state, but it requires a unique representation of the states involved in the task. Songbird RL circuitry is thought to operate using a representation of each moment within song syllables, consistent with the sparse sequential bursting of neurons in premotor cortical nucleus HVC. However, such sparse sequences are not present in very young birds, which sing highly variable syllables of random lengths. Here, we review and expand upon a model for how the songbird brain could construct latent sequences to support RL, in light of new data elucidating connections between HVC and auditory cortical areas. We hypothesize that learning occurs via four distinct plasticity processes: 1) formation of 'tutor memory' sequences in auditory areas; 2) formation of appropriately timed latent HVC sequences, seeded by inputs from auditory areas spontaneously replaying the tutor song; 3) strengthening, during spontaneous replay, of connections from HVC to auditory neurons of corresponding timing in the 'tutor memory' sequence, aligning auditory and motor representations for subsequent song evaluation; and 4) strengthening of connections from premotor neurons to motor output neurons that produce the desired sounds, via well-described song RL circuitry.
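The abstract's point that RL needs a unique representation of each state can be illustrated with a toy sketch. The code below is not the model reviewed in the article: it uses a generic reward-modulated perturbation rule, and every name and value in it (`learn_song`, `state_of_step`, the target values, the noise level and the learning rate) is a hypothetical chosen for illustration. Each time step of a short "song" has a target motor output, standing in for the tutor memory. The learner either has one distinct state per time step, loosely analogous to a sparse sequence, or a single state shared by all time steps.

```python
import random


def learn_song(target, state_of_step, n_trials=2000, noise=0.3, lr=0.1, seed=0):
    """Toy reward-modulated perturbation learning (illustrative assumption only).

    target        -- desired motor output at each time step (stand-in for tutor memory)
    state_of_step -- index of the state active at each time step
    Returns the learned mean output at each time step.
    """
    rng = random.Random(seed)
    n_states = max(state_of_step) + 1
    weights = [0.0] * n_states    # one motor weight per state
    baseline = [0.0] * n_states   # running estimate of expected reward per state
    for _ in range(n_trials):
        for t, goal in enumerate(target):
            s = state_of_step[t]
            explore = rng.gauss(0.0, noise)      # exploratory motor variability
            output = weights[s] + explore
            reward = -(output - goal) ** 2       # higher when output is closer to target
            # Keep perturbations that did better than expected, undo those that did worse.
            weights[s] += lr * (reward - baseline[s]) * explore
            baseline[s] += 0.1 * (reward - baseline[s])
    return [weights[s] for s in state_of_step]


if __name__ == "__main__":
    target = [0.2, 0.8, 0.5, 0.1]
    # One distinct state per time step: each output can be learned separately.
    print(learn_song(target, state_of_step=[0, 1, 2, 3]))
    # One state shared by all time steps: a single output must serve every moment.
    print(learn_song(target, state_of_step=[0, 0, 0, 0]))
```

With distinct states, each time step can converge toward its own target. With a shared state, the learner has only one adjustable output, so it cannot match targets that differ across time steps. This is the sense in which a latent sequence that tiles the song would be a prerequisite for this kind of learning.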