Article code Journal code Publication year English article Full-text version
392555 664777 2014 19 pages, PDF Free download
English title of the ISI article
Reinforcement learning with automatic basis construction based on isometric feature mapping
Persian translation of the title
تقویت یادگیری با ساختار پایه اتوماتیک براساس نقشه برداری ایزومتریک
Keywords
Reinforcement learning, Isometric feature mapping, Value function approximation, Approximate policy iteration, Learning control
Related topics
Engineering and Basic Sciences, Computer Engineering, Artificial Intelligence
English abstract

Value function approximation (VFA) has been a major research topic in reinforcement learning. Although various reinforcement learning algorithms with VFA have been proposed, the performance of most previous algorithms depends on the predefined structure of the basis functions. To address this problem, this paper presents a novel basis learning method for VFA based on isometric feature mapping (IFM). In the proposed method, basis functions for VFA are generated automatically by constructing the optimal embedding of the data in a d-dimensional Euclidean space, namely the embedding that best preserves the estimated intrinsic geometry of the manifold. The IFM-based basis learning method is then integrated with approximate policy iteration (API) for learning control in Markov decision problems with large state spaces, yielding a new manifold reinforcement learning framework termed IFM-based API (IFM-API). Three learning control problems, including a real control system, the Googol single inverted pendulum, were studied to evaluate the performance of the proposed IFM-API algorithm. The simulation and experimental results show that, compared with other basis selection or learning methods, the IFM-based basis learning method can automatically compute an efficient set of basis functions with far fewer predefined parameters and lower computational cost. The results also show that the proposed IFM-API algorithm can obtain better learning control policies than other API methods.
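
The basis construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it is a plain Isomap procedure (k-nearest-neighbour graph, shortest-path geodesic distances, classical multidimensional scaling) applied to a batch of sampled states, with the resulting embedding coordinates used as features for a linear value function. The function name `isomap_basis`, its parameters `n_neighbors` and `d`, and the added constant feature are assumptions made for illustration. Extending the basis to states outside the sample is omitted.

```python
import numpy as np


def isomap_basis(states, n_neighbors=6, d=2):
    """Return an (n, d + 1) feature matrix: a bias column plus Isomap coordinates.

    Hypothetical helper for illustration; `states` is an (n, m) array of samples.
    """
    n = states.shape[0]

    # Pairwise Euclidean distances between the sampled states.
    diff = states[:, None, :] - states[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    # Symmetric k-nearest-neighbour graph; missing edges are infinite.
    graph = np.full((n, n), np.inf)
    np.fill_diagonal(graph, 0.0)
    nearest = np.argsort(dist, axis=1)[:, 1:n_neighbors + 1]  # skip self
    rows = np.repeat(np.arange(n), n_neighbors)
    cols = nearest.ravel()
    graph[rows, cols] = dist[rows, cols]
    graph[cols, rows] = dist[rows, cols]

    # Floyd-Warshall shortest paths approximate geodesic distances.
    for k in range(n):
        graph = np.minimum(graph, graph[:, k:k + 1] + graph[k:k + 1, :])
    if not np.isfinite(graph).all():
        raise ValueError("neighbourhood graph is disconnected; increase n_neighbors")

    # Classical MDS: double-centre the squared distances, keep the top d eigenpairs.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (graph ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    top = np.argsort(eigvals)[::-1][:d]
    coords = eigvecs[:, top] * np.sqrt(np.maximum(eigvals[top], 0.0))

    # Assumption: prepend a constant feature so the linear model has a bias term.
    return np.hstack([np.ones((n, 1)), coords])
```

Given such a feature matrix `phi` and a vector of value targets, a linear approximation could be fitted with `np.linalg.lstsq(phi, targets, rcond=None)`. In an API setting the weights would instead come from a policy evaluation step, which this sketch does not cover.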

Publisher
Database: Elsevier - ScienceDirect
Journal: Information Sciences - Volume 286, 1 December 2014, Pages 209–227