Gaussian process dynamic programming

کد مقاله	کد نشریه	سال انتشار	مقاله انگلیسی	نسخه تمام متن
410612	679154	2009	17 صفحه PDF	دانلود رایگان

عنوان انگلیسی مقاله ISI

دانلود مقاله + سفارش ترجمه

دانلود مقاله ISI انگلیسی

رایگان برای ایرانیان

کلمات کلیدی

Dynamic programming - برنامه‌ریزی پویا یا برنامه‌ نویسی پویا Gaussian processes - فرآیندهای گاوسی Optimal control - کنترل بهینه Reinforcement learning - یادگیری تقویتی Policy learning - یادگیری سیاست

موضوعات مرتبط

مهندسی و علوم پایه مهندسی کامپیوتر هوش مصنوعی

پیش نمایش صفحه اول مقاله

چکیده انگلیسی

Reinforcement learning (RL) and optimal control of systems with continuous states and actions require approximation techniques in most interesting cases. In this article, we introduce Gaussian process dynamic programming (GPDP), an approximate value function-based RL algorithm. We consider both a classic optimal control problem, where problem-specific prior knowledge is available, and a classic RL problem, where only very general priors can be used. For the classic optimal control problem, GPDP models the unknown value functions with Gaussian processes and generalizes dynamic programming to continuous-valued states and actions. For the RL problem, GPDP starts from a given initial state and explores the state space using Bayesian active learning. To design a fast learner, available data have to be used efficiently. Hence, we propose to learn probabilistic models of the a priori unknown transition dynamics and the value functions on the fly. In both cases, we successfully apply the resulting continuous-valued controllers to the under-actuated pendulum swing up and analyze the performances of the suggested algorithms. It turns out that GPDP uses data very efficiently and can be applied to problems, where classic dynamic programming would be cumbersome.

ناشر

Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Neurocomputing - Volume 72, Issues 7–9, March 2009, Pages 1508–1524

نویسندگان

Marc Peter Deisenroth, Carl Edward Rasmussen, Jan Peters,

علوم انسانی و هنر

فنی، مهندسی و علوم پایه

پزشکی و سلامت

بیو تکنولوژی

پذیرش سفارش ترجمه

Gaussian process dynamic programming

دسترسی سریع

ارتباط

English Website