Free article download: Data-driven adaptive dynamic programming schemes for non-zero-sum games of unknown discrete-time nonlinear systems

Article code	Journal code	Publication year	English article	Full-text version
6864906	1439552	2018	10-page PDF	Free download

English title of the ISI article

Data-driven adaptive dynamic programming schemes for non-zero-sum games of unknown discrete-time nonlinear systems

Persian translation of the title

طرح های برنامه ریزی پویای سازگار با داده ها برای بازی های غیر صفر از سیستم های غیر خطی زمان گسسته ناشناخته

Keywords

Reinforcement learning, adaptive dynamic programming, data-driven, non-zero-sum games, neural networks

Related subjects

Engineering and Basic Sciences, Computer Engineering, Artificial Intelligence

English abstract

This paper integrates game theory, optimal control theory and reinforcement learning to address the discrete-time (DT) multi-player non-zero-sum game problem. As is well known, the solutions to non-zero-sum game problems are given by coupled Riccati equations or coupled Hamilton-Jacobi equations, which are generally difficult to solve analytically and require accurate mathematical models of the system. However, for most practical industrial systems, the system dynamics cannot be obtained accurately or are unavailable altogether, so conventional model-based methods do not apply. To overcome this deficiency, we develop data-based adaptive dynamic programming (ADP) algorithms for completely unknown multi-player systems. Firstly, the Nash equilibrium and stationarity conditions are used to formulate the DT multi-player non-zero-sum game, and then a policy iteration algorithm is applied to approximate the optimal solutions successively. Secondly, a novel online ADP algorithm combined with a neural-network-based identification scheme is designed, which requires only system data instead of the real system models. Subsequently, a data-driven action-dependent heuristic dynamic programming approach is presented, which circumvents the estimation errors caused by the identification learning procedure. Finally, two simulation examples are provided to illustrate the feasibility of our schemes.
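
To make the policy iteration step mentioned in the abstract more concrete, here is a minimal Python sketch on a toy problem. It is not the paper's algorithm: it assumes a known, scalar, linear system x[k+1] = a*x[k] + b1*u1[k] + b2*u2[k] with two players, quadratic costs, and linear feedback policies u_i = -k_i*x, whereas the paper targets unknown nonlinear systems and uses neural networks. All numbers and the function names (`evaluate`, `improve`, `policy_iteration`) are hypothetical, and the improvement step here solves both players' stationarity conditions simultaneously, which is one possible choice among several.

```python
"""Toy policy iteration for a scalar two-player non-zero-sum LQ game."""

# Hypothetical example data (not taken from the paper).
A = 0.9              # open-loop dynamics; |A| < 1, so zero gains are admissible
B = (0.5, 0.3)       # input gains b_1, b_2
Q = (1.0, 2.0)       # state weights q_1, q_2
R = ((1.0, 0.5),     # R[i][j]: weight of player j's control in player i's cost
     (0.5, 1.0))


def evaluate(k, a, b, q, r):
    """Policy evaluation: value V_i(x) = p_i * x**2 under gains k = (k_1, k_2).

    Each p_i solves p_i = q_i + r_i1*k_1**2 + r_i2*k_2**2 + p_i * a_c**2,
    where a_c is the closed-loop dynamics.
    """
    a_c = a - b[0] * k[0] - b[1] * k[1]
    if abs(a_c) >= 1.0:
        raise ValueError("policies are not stabilizing; costs are infinite")
    return tuple(
        (q[i] + r[i][0] * k[0] ** 2 + r[i][1] * k[1] ** 2) / (1.0 - a_c ** 2)
        for i in range(2)
    )


def improve(p, a, b, r):
    """Policy improvement from the stationarity conditions.

    Setting the derivative of each player's one-step cost-to-go to zero gives
    r_ii * k_i = p_i * b_i * a_c, with a_c = a - b_1*k_1 - b_2*k_2.
    Solving both conditions together yields a closed form for a_c.
    """
    gain = [p[i] * b[i] / r[i][i] for i in range(2)]
    a_c = a / (1.0 + gain[0] * b[0] + gain[1] * b[1])
    return (gain[0] * a_c, gain[1] * a_c)


def policy_iteration(a, b, q, r, tol=1e-12, max_iter=200):
    """Alternate evaluation and improvement until the gains stop changing."""
    k = (0.0, 0.0)  # admissible start because |a| < 1
    for _ in range(max_iter):
        p = evaluate(k, a, b, q, r)
        k_new = improve(p, a, b, r)
        done = max(abs(k_new[0] - k[0]), abs(k_new[1] - k[1])) < tol
        k = k_new
        if done:
            break
    return k, evaluate(k, a, b, q, r)


if __name__ == "__main__":
    gains, values = policy_iteration(A, B, Q, R)
    print("feedback gains:", gains)
    print("value coefficients:", values)
```

At a fixed point of this loop, the gains and value coefficients jointly satisfy the scalar analogue of the coupled Riccati equations, and neither player can lower their own cost by changing only their own gain. Convergence is not guaranteed for arbitrary game data; this sketch only illustrates the evaluate-then-improve structure.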

Publisher

Database: Elsevier - ScienceDirect
Journal: Neurocomputing - Volume 275, 31 January 2018, Pages 649-658

Authors

He Jiang, Huaguang Zhang, Kun Zhang, Xiaohong Cui
