Learning rate free reinforcement learning for real-time motion control using a value-gradient based policy

Article code	Journal code	Publication year	English article	Full-text version
731934	893188	2014	9-page PDF	Free download

Keywords

Robotics; Local linear regression; Process model; Reinforcement learning

Related topics

Engineering and basic sciences, Other engineering fields, Control and systems engineering

Abstract

Reinforcement learning (RL) is a framework that enables a controller to find an optimal control policy for a task in an unknown environment. Although RL has been successfully used to solve optimal control problems, learning is generally slow. The main causes are the inefficient use of information collected during interaction with the system and the inability to use prior knowledge about the system or the control task. In addition, the learning speed heavily depends on the learning rate parameter, which is difficult to tune. In this paper, we present a sample-efficient, learning-rate-free version of the Value-Gradient Based Policy (VGBP) algorithm. The main difference between VGBP and other frequently used algorithms, such as Sarsa, is that in VGBP the learning agent has direct access to the reward function, rather than just the immediate reward values. Furthermore, the agent learns a process model. This enables the algorithm to select control actions by optimizing over the right-hand side of the Bellman equation. We demonstrate the fast learning convergence in simulations and experiments with the underactuated pendulum swing-up task. In addition, we present experimental results for a more complex 2-DOF robotic manipulator.
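
The action-selection idea in the abstract (maximizing the right-hand side of the Bellman equation, r(x, u) + γ V(f(x, u)), given a reward function and a process model) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the pendulum model, the reward weights, the placeholder value function, the discount factor and the coarse grid of candidate torques are all assumptions chosen for illustration, and a plain grid search stands in for whatever optimizer the paper actually uses.

```python
import math

GAMMA = 0.97  # discount factor (assumed value)
DT = 0.03     # sampling period in seconds (assumed value)

def model(x, u):
    """Hypothetical process model f(x, u): one Euler step of a damped pendulum.

    x = (angle measured from upright, angular velocity), u = torque.
    In the paper the model is learned; here it is hard-coded.
    """
    theta, omega = x
    alpha = 9.81 * math.sin(theta) - 0.1 * omega + u
    return (theta + DT * omega, omega + DT * alpha)

def reward(x, u):
    """Reward function known to the agent: quadratic penalty (assumed weights)."""
    theta, omega = x
    return -(5.0 * theta**2 + 0.1 * omega**2 + 0.01 * u**2)

def value(x):
    """Placeholder value-function estimate; a learned critic would go here."""
    theta, omega = x
    return -(50.0 * theta**2 + 5.0 * omega**2)

def greedy_action(x, actions):
    """Return the candidate action maximizing r(x, u) + GAMMA * V(f(x, u))."""
    return max(actions, key=lambda u: reward(x, u) + GAMMA * value(model(x, u)))

# Candidate torques from -3.0 to 3.0 in steps of 0.5 (assumed actuator range).
actions = [-3.0 + 0.5 * i for i in range(13)]
u = greedy_action((0.3, 0.0), actions)
```

Because the agent evaluates the reward function and the model for every candidate action, it can rank actions it has never tried, which is the contrast the abstract draws with methods such as Sarsa that only observe immediate reward values.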

Publisher

Database: Elsevier - ScienceDirect
Journal: Mechatronics - Volume 24, Issue 8, December 2014, Pages 966–974

Authors

J.C. van Rooijen, I. Grondman, R. Babuška
