Navigation with memory in a partially observable environment

Article ID	Journal	Published Year	Pages	File Type
413346	Robotics and Autonomous Systems	2006	11 Pages	PDF

Abstract

The paper presents an architecture that enables reactive visual navigation through unsupervised reinforcement learning. This objective is reached by using Q-learning and a hierarchical organization of the architecture. Applying these techniques requires departing from the Partially Observable Markov Decision Process (POMDP) framework and introducing several innovations: heuristic techniques for generalizing experience and for handling partial observability; a technique for speeding up the update of the Q function; and a special reinforcement policy suited to learning a complex task without supervision. The result is satisfactory learning of the navigation task in a simulated environment.
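
For readers unfamiliar with the Q-learning rule the abstract refers to, the following is a minimal sketch of the standard tabular update only. It is not the paper's hierarchical architecture, its generalization heuristics, or its accelerated update technique. The function name `q_update`, the observation labels, the action names, and the values of `alpha` and `gamma` are all illustrative assumptions.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """Apply one standard tabular Q-learning update in place.

    q maps (state, action) pairs to estimated returns. alpha is the
    learning rate and gamma the discount factor (both assumed values).
    """
    # Value of the best action available from the next state.
    best_next = max(q[(next_state, a)] for a in actions)
    # Temporal-difference target: immediate reward plus discounted lookahead.
    td_target = reward + gamma * best_next
    # Move the current estimate a fraction alpha toward the target.
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Hypothetical observations and actions, chosen only for illustration.
q = defaultdict(float)
actions = ["left", "right", "forward"]
first = q_update(q, "obs_A", "forward", 1.0, "obs_B", actions)
second = q_update(q, "obs_A", "forward", 1.0, "obs_B", actions)
```

In a partially observable setting the `state` key would in practice be an observation, or an observation combined with some memory, which is where an approach like the one described has to depart from the plain rule above.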

Keywords

Self-localization; Navigation; Hierarchical reinforcement learning