Article ID | Journal | Published Year | Pages |
---|---|---|---|
10127070 | Neural Networks | 2018 | 36 |
Abstract
Operant learning is learning based on reinforcement of behaviours. We propose a new hypothesis for operant learning at the single neuron level based on spontaneous fluctuations of synaptic strength caused by receptor dynamics. These fluctuations allow the neural system to explore a space of outputs. If the receptor dynamics are altered by a reinforcement signal, the neural system settles into better states, i.e., states that match the environmental dynamics that determine reward. Simulations show that this mechanism can support operant learning in a feed-forward neural circuit, a recurrent neural circuit, and a spiking neural circuit controlling an agent learning in a dynamic reward and punishment situation. We discuss how the new principle relates to existing learning rules and observed phenomena of short- and long-term potentiation.
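The general idea of exploring by random weight fluctuations and settling under a reinforcement signal can be illustrated with a toy sketch. This is not the authors' receptor-dynamics model: it is a simple reward-gated random search over two weights of a linear neuron, and the function `train`, the input patterns, the rewarded weights `target_w`, and all parameter values are assumptions made only for illustration.

```python
import random


def train(steps=2000, noise=0.1, seed=0):
    """Toy reward-gated random search over two synaptic weights (illustrative only)."""
    rng = random.Random(seed)
    inputs = [(0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]  # hypothetical input patterns
    target_w = (0.5, -0.3)  # hypothetical environment: outputs of these weights are rewarded
    w = [0.0, 0.0]

    def reward(weights):
        # Negative squared error between the neuron's output and the rewarded output.
        err = 0.0
        for x in inputs:
            y = sum(wi * xi for wi, xi in zip(weights, x))
            t = sum(ti * xi for ti, xi in zip(target_w, x))
            err += (y - t) ** 2
        return -err

    initial = reward(w)
    best = initial
    for _ in range(steps):
        # Spontaneous fluctuation of each synaptic strength.
        trial = [wi + rng.gauss(0.0, noise) for wi in w]
        r = reward(trial)
        # Reinforcement signal: the fluctuation persists only if reward improved.
        if r > best:
            w, best = trial, r
    return w, initial, best


if __name__ == "__main__":
    weights, initial_reward, final_reward = train()
    print("weights:", weights)
    print("reward before:", initial_reward, "after:", final_reward)
```

In this sketch the fluctuations are kept or discarded outright, which is a deliberate simplification. The abstract describes the reinforcement signal as altering receptor dynamics, not as a hard accept/reject rule.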
Related Topics
Physical Sciences and Engineering
Computer Science
Artificial Intelligence
Authors
Tianqi Wei, Barbara Webb