Free article download: Multi-step heuristic dynamic programming for optimal control of nonlinear discrete-time systems

Article code	Journal code	Publication year	English article	Full-text version
4944331	1437987	2017	31-page PDF	Free download

English title of the ISI article

Multi-step heuristic dynamic programming for optimal control of nonlinear discrete-time systems

Persian translation of the title

برنامه نویسی پویا چند مرحله ای برای کنترل بهینه از سیستم های گسسته غیر خطی

Keywords

Optimal control; Multi-step heuristic dynamic programming; Adaptive dynamic programming; Nonlinear systems; Discrete-time; Neural networks

Related topics

Engineering and Basic Sciences, Computer Engineering, Artificial Intelligence

Abstract

Policy iteration and value iteration are two main iterative adaptive dynamic programming frameworks for solving optimal control problems. Policy iteration converges fast but requires an initial stabilizing control policy, which is a strict constraint in practice. Value iteration avoids the requirement of an initial admissible control policy but converges much more slowly. This paper aims to combine the advantages of policy iteration and value iteration while avoiding their drawbacks. To this end, a multi-step heuristic dynamic programming (MsHDP) method is developed for solving the optimal control problem of nonlinear discrete-time systems. MsHDP speeds up value iteration and, at the same time, avoids the requirement of an initial admissible control policy in policy iteration. The convergence theory of MsHDP is established by proving that it converges to the solution of the Bellman equation. For implementation purposes, an actor-critic neural network (NN) structure is developed. The critic NN is employed to estimate the value function, and its weight vector is computed with a least-squares scheme. The actor NN is used to estimate the control policy, and a gradient descent method is proposed for updating its weight vector. Comparative simulation studies on two examples verify the effectiveness and advantages of MsHDP.
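
The abstract does not state the MsHDP update equations, so the following is only a minimal sketch of the general idea it describes: a value update that looks several steps ahead under the current greedy policy, started from a zero value function rather than from a stabilizing policy. It assumes a scalar linear system x_{k+1} = a*x_k + b*u_k with stage cost q*x^2 + r*u^2 and a quadratic value function V(x) = p*x^2, so that the result can be checked against the closed-form Riccati solution. All names and parameter values (greedy_gain, multi_step_update, iterate, a = 1.2, and so on) are illustrative assumptions and are not taken from the paper, which treats nonlinear systems with neural-network approximators.

```python
import math


def greedy_gain(p, a, b, r):
    """Feedback gain k of the policy u = -k*x that is greedy for V(x) = p*x^2."""
    return a * b * p / (r + b * b * p)


def multi_step_update(p, m, a, b, q, r):
    """One iteration: improve the policy once, then back up the value m steps.

    m = 1 reduces to ordinary value iteration; a large m approaches policy
    iteration. This is an assumed, simplified form of a multi-step update.
    """
    k = greedy_gain(p, a, b, r)
    c = a - b * k              # closed-loop dynamics x_{k+1} = c * x_k
    stage = q + r * k * k      # stage cost coefficient under u = -k*x
    new_p = p
    for _ in range(m):
        new_p = stage + c * c * new_p
    return new_p


def iterate(m, a=1.2, b=1.0, q=1.0, r=1.0, tol=1e-10, max_iter=10_000):
    """Run the m-step scheme from p = 0 (no initial admissible policy needed).

    Returns the final value coefficient and the number of iterations used.
    """
    p = 0.0
    for i in range(1, max_iter + 1):
        new_p = multi_step_update(p, m, a, b, q, r)
        if abs(new_p - p) < tol:
            return new_p, i
        p = new_p
    return p, max_iter


def riccati_solution(a, b, q, r):
    """Positive root of the scalar discrete-time algebraic Riccati equation."""
    beta = r - q * b * b - a * a * r
    return (-beta + math.sqrt(beta * beta + 4.0 * b * b * q * r)) / (2.0 * b * b)


if __name__ == "__main__":
    p_star = riccati_solution(1.2, 1.0, 1.0, 1.0)
    for m in (1, 5):
        p, n = iterate(m)
        print(f"m={m}: p={p:.8f} after {n} iterations (exact p*={p_star:.8f})")
```

In this toy setting the open-loop system is unstable (a = 1.2), yet both runs start from p = 0. Comparing the iteration counts printed for m = 1 and m = 5 gives a rough feel for the trade-off the abstract describes.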

Publisher

Database: Elsevier - ScienceDirect
Journal: Information Sciences - Volume 411, October 2017, Pages 66-83

Authors

Biao Luo, Derong Liu, Tingwen Huang, Xiong Yang, Hongwen Ma
