Article code: 377399
Journal code: 658418
Publication year: 2008
English article: PDF, 29 pages
Full-text version: free download
English title of the ISI article
Restricted gradient-descent algorithm for value-function approximation in reinforcement learning
Related subject areas
Engineering and Basic Sciences; Computer Engineering; Artificial Intelligence
English abstract

This work presents the restricted gradient-descent (RGD) algorithm, a training method for local radial-basis function networks developed specifically for use in reinforcement learning. The RGD algorithm can be seen as a way to extract relevant features from the state space to feed a linear model that computes an approximation of the value function. Its basic idea is to restrict the way the standard gradient-descent algorithm changes the hidden units of the approximator, resulting in conservative modifications that make the learning process less prone to divergence. The algorithm is also able to configure the topology of the network, an important characteristic in reinforcement learning, where the changing policy may impose different requirements on the approximator structure. Computational experiments show that the RGD algorithm consistently generates better value-function approximations than the standard gradient-descent method, and that the latter is more susceptible to divergence. In the pole-balancing and Acrobot tasks, RGD combined with SARSA achieves results competitive with other methods reported in the literature, including evolutionary and recent reinforcement-learning algorithms.
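The abstract describes the mechanism only at a high level, so the Python sketch below illustrates one way such a scheme could look: the output weights of a Gaussian RBF network take full semi-gradient TD steps, hidden-unit centres receive only damped, winner-only updates, and a new unit is added whenever no existing unit covers the current state. The class name, the threshold values, and the specific restriction and unit-addition rules here are illustrative assumptions, not the RGD algorithm's actual update equations.

import numpy as np

class RestrictedRBFValueFunction:
    """Gaussian RBF value-function approximator with conservative hidden-unit
    updates and on-the-fly addition of units (illustrative sketch only; not
    the exact RGD update rules from the paper)."""

    def __init__(self, lr_w=0.1, lr_hidden=0.01, sigma0=0.5, act_threshold=0.3):
        self.centers = np.empty((0, 1))     # unit centres (1-D state assumed)
        self.widths = np.empty(0)           # unit widths
        self.weights = np.empty(0)          # linear output weights
        self.lr_w = lr_w                    # full step size for the linear layer
        self.lr_hidden = lr_hidden          # damped step size for hidden units
        self.sigma0 = sigma0                # width given to newly added units
        self.act_threshold = act_threshold  # coverage needed to skip adding a unit

    def _phi(self, s):
        """Activations of all hidden units at state s."""
        if self.centers.shape[0] == 0:
            return np.empty(0)
        d2 = np.sum((self.centers - s) ** 2, axis=1)
        return np.exp(-d2 / (2.0 * self.widths ** 2))

    def value(self, s):
        phi = self._phi(np.atleast_1d(s))
        return float(phi @ self.weights) if phi.size else 0.0

    def update(self, s, td_error):
        s = np.atleast_1d(s).astype(float)
        phi = self._phi(s)
        # Configure the topology: add a unit when the state is poorly covered.
        if phi.size == 0 or phi.max() < self.act_threshold:
            self.centers = np.vstack([self.centers, s])
            self.widths = np.append(self.widths, self.sigma0)
            self.weights = np.append(self.weights, 0.0)
            phi = self._phi(s)
        # Full semi-gradient step on the linear output weights.
        self.weights += self.lr_w * td_error * phi
        # Restricted step on the hidden layer: only the most active unit moves,
        # and only by a damped amount, keeping modifications conservative.
        k = int(np.argmax(phi))
        grad_c = self.weights[k] * phi[k] * (s - self.centers[k]) / self.widths[k] ** 2
        self.centers[k] += self.lr_hidden * td_error * grad_c

# Example: one TD(0)-style update for a transition (s, r, s') with gamma = 0.9.
vf = RestrictedRBFValueFunction()
s, r, s_next, gamma = 0.2, 1.0, 0.4, 0.9
delta = r + gamma * vf.value(s_next) - vf.value(s)
vf.update(s, delta)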

Publisher
Database: Elsevier - ScienceDirect
Journal: Artificial Intelligence - Volume 172, Issues 4–5, March 2008, Pages 454-482