Output feedback Q-learning for discrete-time linear zero-sum games with application to the H-infinity control

Article code	Journal code	Publication year	English article	Full-text version
7108333	1460620	2018	9-page PDF	Free download

Keywords

ADP; Zero-sum games; Q-learning; Output feedback; H-infinity control; Reinforcement learning

Related topics

Engineering and basic sciences; Other engineering disciplines; Control and systems engineering

Abstract

Approximate dynamic programming techniques usually rely on the feedback of the measurement of the complete state, which is generally not available in practical situations. In this paper, we present an output feedback Q-learning algorithm towards finding the optimal strategies for the discrete-time linear quadratic zero-sum game, which encompasses the H-infinity optimal control problem. A new representation of the Q-function in the output feedback form is derived for the zero-sum game problem and the optimal output feedback policies are presented. Then, a Q-learning algorithm is developed that learns the optimal control strategies online without needing any information about the system dynamics, which makes the control design completely model-free. It is shown that the proposed algorithm converges to the optimal solution obtained by solving the game algebraic Riccati equation (GARE). Unlike the value function based approach used for output feedback, the proposed Q-learning scheme does not require a discounting factor that is generally adopted to mitigate the effect of excitation noise bias. It is known that this discounting factor may compromise the closed-loop stability. The proposed method overcomes the excitation noise bias problem without resorting to the discounting factor, and therefore, converges to the nominal GARE solution. As a result, the closed-loop stability is preserved. An application to the H-infinity autopilot controller for the F-16 aircraft is demonstrated by simulation.
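
The abstract states that the learned strategies converge to the solution of the game algebraic Riccati equation (GARE). As a rough illustration of that reference solution only, the sketch below solves a GARE by model-based value iteration for a state-feedback game. It is not the paper's model-free output feedback algorithm. The dynamics x[k+1] = A x[k] + B u[k] + E w[k], the stage cost x'Qx + u'Ru - gamma^2 w'w, the function names, and all numerical values are assumptions chosen for illustration.

```python
import numpy as np


def gare_rhs(P, A, G, Q, W):
    """One value-iteration step: evaluate the right-hand side of the GARE.

    G = [B E] stacks the control and disturbance input matrices, and
    W = diag(R, -gamma^2 I) is the (indefinite) input weighting.
    """
    K = np.linalg.solve(W + G.T @ P @ G, G.T @ P @ A)
    P_new = Q + A.T @ P @ A - A.T @ P @ G @ K
    return 0.5 * (P_new + P_new.T)  # symmetrize against round-off


def solve_gare(A, B, E, Q, R, gamma, max_iter=1000, tol=1e-12):
    """Iterate P <- gare_rhs(P) from P = 0 until the update is below tol.

    Returns P and the gains K_u, K_w of the policies u = -K_u x, w = -K_w x.
    Convergence is assumed, not guaranteed, for arbitrary data.
    """
    n, m_u, m_w = A.shape[0], B.shape[1], E.shape[1]
    G = np.hstack([B, E])
    W = np.block([
        [R, np.zeros((m_u, m_w))],
        [np.zeros((m_w, m_u)), -gamma**2 * np.eye(m_w)],
    ])
    P = np.zeros((n, n))
    for _ in range(max_iter):
        P_next = gare_rhs(P, A, G, Q, W)
        converged = np.max(np.abs(P_next - P)) < tol
        P = P_next
        if converged:
            break
    K = np.linalg.solve(W + G.T @ P @ G, G.T @ P @ A)
    return P, K[:m_u], K[m_u:]


# Hypothetical scalar example (values are illustrative, not from the paper).
A = np.array([[0.9]])
B = np.array([[1.0]])
E = np.array([[0.5]])
Q = np.array([[1.0]])
R = np.array([[1.0]])
gamma = 5.0

P, K_u, K_w = solve_gare(A, B, E, Q, R, gamma)
A_cl = A - B @ K_u - E @ K_w  # closed loop under both optimal policies
print("P =", P, "K_u =", K_u, "K_w =", K_w, "A_cl =", A_cl)
```

In this scalar case the closed-loop matrix has magnitude below one, so the state decays under both policies. A model-free method such as the one described in the abstract would have to reach the same P without access to A, B, or E.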

Publisher

Database: Elsevier - ScienceDirect
Journal: Automatica - Volume 95, September 2018, Pages 213-221

Authors

Syed Ali Asad Rizvi, Zongli Lin
