An Autonomous Distal Reward Learning Architecture for Embodied Agents

Article ID	Journal	Published Year	Pages	File Type
488943	Procedia Computer Science	2012	7 Pages	PDF

Abstract

Distal reward refers to a class of problems in which reward is temporally distal from the actions that lead to it. The difficulty for any biological neural system is that the neural activations that caused an agent to achieve reward may no longer be present when the reward is experienced. Therefore, in addition to the usual reward assignment problem, there is the added complexity of assigning reward back through time to neural activations that have already faded. Although this problem has been thoroughly studied over the years using methods such as reinforcement learning, we are interested in a more biologically motivated neural architectural approach. This paper introduces one such architecture, which exhibits rudimentary distal reward learning based on associations of bottom-up visual sensory sequences with bottom-up proprioceptive motor sequences formed while an agent explores an environment. After sufficient learning, the agent is able to locate the reward by chaining together top-down motor command sequences. This paper briefly discusses the details of the neural architecture, the agent-based modeling system in which it is embodied, a virtual Morris water maze environment used for training and evaluation, and a sampling of numerical experiments characterizing its learning properties.
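
The idea of locating a reward by chaining learned sensory-motor associations can be illustrated with a toy sketch. This is not the neural architecture described in the paper: the symbolic lookup table, the breadth-first search, and all names here (`learn_associations`, `chain_commands`, the state labels, the exploration log) are hypothetical stand-ins chosen only to make the chaining step concrete.

```python
from collections import deque


def learn_associations(trajectory):
    """Record which motor command led from one visual state to the next.

    Assumption: states and commands are plain labels, not neural codes.
    """
    assoc = {}
    for state, command, next_state in trajectory:
        assoc.setdefault(state, {})[command] = next_state
    return assoc


def chain_commands(assoc, start, goal):
    """Chain learned associations from start to goal (breadth-first search)."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, commands = queue.popleft()
        if state == goal:
            return commands
        for command, nxt in assoc.get(state, {}).items():
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, commands + [command]))
    return None  # goal not reachable from what was explored


# Hypothetical exploration log: (visual state, motor command, next visual state)
exploration = [
    ("A", "forward", "B"),
    ("B", "left", "C"),
    ("C", "forward", "platform"),
    ("A", "right", "D"),
]

assoc = learn_associations(exploration)
plan = chain_commands(assoc, "A", "platform")
print(plan)  # ['forward', 'left', 'forward']
```

In this sketch the reward location ("platform") is reached only at the end of the chain, so no single stored association carries the reward signal by itself; the sequence is recovered by linking associations formed during exploration.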