Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
4315003 | 1290059 | 2009 | 5-page PDF | Free download |
Modern theoretical accounts of reward-based learning are commonly based on reinforcement learning algorithms. The most prominent of these is the temporal-difference (TD) algorithm, in which the difference between predicted and obtained reward, the prediction error, serves as a learning signal. Consequently, larger rewards cause larger prediction errors and lead to faster learning than smaller rewards. Therefore, if animals employ a neural implementation of TD learning, reward magnitude should affect learning in animals accordingly.

Here we test this prediction by training pigeons on a simple color-discrimination task with two pairs of colors. In each pair, correct discrimination is rewarded: in pair one with a large reward, in pair two with a small reward. Pigeons acquired the 'large-reward' discrimination faster than the 'small-reward' discrimination. Animal behavior and an implementation of the TD algorithm yielded comparable results with respect to the difference between learning curves in the large-reward and small-reward conditions. We conclude that the influence of reward magnitude on the acquisition of a simple discrimination paradigm is accurately reflected by a TD implementation of reinforcement learning.
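The link between reward magnitude and learning speed can be illustrated with a minimal sketch. This is not the implementation used in the study. It assumes a one-step prediction-error update (the TD rule with no successor state), a logistic choice rule, and an expected-value update in place of sampled choices so that the result is deterministic. The function names, the learning rate `alpha`, the choice sensitivity `beta`, and the reward values 1.0 and 0.2 are all hypothetical choices made for illustration.

```python
import math


def simulate_learning_curve(reward, alpha=0.1, beta=5.0, n_trials=100):
    """Return the probability of a correct choice on each trial.

    Assumptions (illustrative only): one-step task, logistic choice rule,
    and an expected update weighted by the probability of choosing correctly.
    """
    value_correct = 0.0
    value_incorrect = 0.0  # never rewarded, so its prediction error stays zero
    curve = []
    for _ in range(n_trials):
        # Logistic choice rule over the difference in learned values.
        p_correct = 1.0 / (1.0 + math.exp(-beta * (value_correct - value_incorrect)))
        curve.append(p_correct)
        # Prediction error: obtained reward minus predicted reward.
        prediction_error = reward - value_correct
        # The correct option is only updated on trials where it is chosen,
        # so the expected update is scaled by p_correct.
        value_correct += alpha * p_correct * prediction_error
    return curve


def trials_to_criterion(curve, criterion=0.7):
    """Return the first 1-based trial whose accuracy meets the criterion, or None."""
    return next((i + 1 for i, p in enumerate(curve) if p >= criterion), None)


# Hypothetical reward magnitudes for the two color pairs.
large_curve = simulate_learning_curve(reward=1.0)
small_curve = simulate_learning_curve(reward=0.2)

print("large reward reaches criterion at trial", trials_to_criterion(large_curve))
print("small reward reaches criterion at trial", trials_to_criterion(small_curve))
```

Under these assumptions the large-reward curve lies at or above the small-reward curve on every trial and reaches the accuracy criterion in fewer trials, which is the qualitative pattern the abstract describes. The specific trial counts depend entirely on the assumed parameters and should not be read as estimates of the pigeons' behavior.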
Journal: Behavioural Brain Research - Volume 198, Issue 1, 2 March 2009, Pages 125–129