Free article download: On integral generalized policy iteration for continuous-time linear quadratic regulations

Article code	Journal code	Publication year	English article	Full-text version
695927	890318	2014	15-page PDF	Free download

English title of the ISI article

On integral generalized policy iteration for continuous-time linear quadratic regulations

Keywords

LQR; Optimization under uncertainties; Adaptive control; Reinforcement learning

Related subjects

Engineering and basic sciences, Other engineering fields, Control and systems engineering

Abstract

This paper mathematically analyzes the integral generalized policy iteration (I-GPI) algorithms applied to a class of continuous-time linear quadratic regulation (LQR) problems with the unknown system matrix A. GPI is the general idea of interacting policy evaluation and policy improvement steps of policy iteration (PI), for computing the optimal policy. We first introduce the update horizon ℏ, and then show that (i) all of the I-GPI methods with the same ℏ can be considered equivalent and that (ii) the value function approximated in the policy evaluation step monotonically converges to the exact one as ℏ → ∞. This reveals the relation between the computational complexity and the update (or time) horizon of I-GPI as well as between I-PI and I-GPI in the limit ℏ → ∞. We also provide and discuss two modes of convergence of I-GPI; I-GPI behaves like PI in one mode, and in the other mode, it performs like value iteration for discrete-time LQR and infinitesimal GPI (ℏ → 0). From these results, a new classification of the integral reinforcement learning is formed with respect to ℏ. Two matrix inequality conditions for stability, the region of local monotone convergence, and data-driven (adaptive) implementation methods are also provided with detailed discussion. Numerical simulations are carried out for verification and further investigations.
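To give a rough feel for the role of the update horizon ℏ, the following is a minimal model-based sketch for a scalar system dx/dt = a*x + b*u with cost integrand q*x^2 + r*u^2. It is not the authors' algorithm: the paper treats an unknown system matrix A and data-driven implementations, whereas this sketch assumes a and b are known. The scalar setting, the parameter values, and all function names are assumptions made for illustration only.

```python
import math


def riccati_solution(a, b, q, r):
    """Positive root of the scalar Riccati equation 2*a*p - (b*p)**2/r + q = 0."""
    return r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)


def exact_evaluation(k, a, b, q, r):
    """Exact value coefficient of the policy u = -k*x (requires a - b*k < 0)."""
    a_c = a - b * k  # closed-loop pole
    return (q + r * k * k) / (-2.0 * a_c)


def horizon_evaluation(p, k, a, b, q, r, h):
    """One value update over the horizon h under the fixed policy u = -k*x.

    With x(t) = exp(a_c*t)*x0, the update is
    p_new*x0^2 = integral_0^h (q + r*k^2)*x(t)^2 dt + p*x(h)^2.
    """
    a_c = a - b * k
    decay = math.exp(2.0 * a_c * h)
    running_cost = (q + r * k * k) * (decay - 1.0) / (2.0 * a_c)
    return running_cost + decay * p


def generalized_policy_iteration(a, b, q, r, k0, h, iterations):
    """Alternate policy improvement with a horizon-h value update."""
    k = k0
    p = exact_evaluation(k0, a, b, q, r)  # start from a stabilizing gain
    for _ in range(iterations):
        k = b * p / r                                # policy improvement
        p = horizon_evaluation(p, k, a, b, q, r, h)  # partial evaluation
    return p, k


if __name__ == "__main__":
    a, b, q, r = 1.0, 1.0, 1.0, 1.0  # hypothetical open-loop unstable plant
    print("Riccati solution:", riccati_solution(a, b, q, r))
    for h in (0.1, 1.0, 10.0):
        p, k = generalized_policy_iteration(a, b, q, r, k0=3.0, h=h, iterations=50)
        print("h =", h, "p =", p, "k =", k)
```

For a fixed stabilizing gain, the decay factor vanishes as the horizon grows, so `horizon_evaluation` approaches `exact_evaluation`; this is the scalar analogue of the limit ℏ → ∞ mentioned in the abstract. Small horizons make each update cheaper but move the value estimate less per step.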

Publisher

Database: Elsevier - ScienceDirect
Journal: Automatica - Volume 50, Issue 2, February 2014, Pages 475–489

Authors

Jae Young Lee, Jin Bae Park, Yoon Ho Choi
