| Article ID | Journal | Published Year | Pages | File Type |
|---|---|---|---|---|
| 4942682 | Engineering Applications of Artificial Intelligence | 2017 | 19 Pages | |
Abstract
To overcome the numerical stability problems that inherently occur in recursive least-squares (RLS)-based adaptive dynamic programming paradigms for online optimal control design, a novel method is proposed to improve the state-value function approximations in online algorithms for discrete linear quadratic regulator (DLQR) control system design. The resulting algorithms are embedded in actor-critic architectures based on heuristic dynamic programming (HDP). The proposed solution relies on unitary transformations and QR decomposition, integrated into the critic network, to improve the efficiency of RLS learning for the online realization of the HDP-DLQR control design. The learning strategy is designed to improve numerical stability and reduce computational cost, aiming to make real-time implementation of optimal control design methodologies based on actor-critic reinforcement learning feasible. The convergence behavior and numerical stability of the proposed online algorithm are evaluated through computational simulations on two multiple-input and multiple-output models: a fourth-order RLC circuit with two input voltages and two controllable voltage levels, and a doubly-fed induction generator with six inputs and six outputs for wind energy conversion systems.
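The central idea described in the abstract, propagating a triangular factor via QR decomposition instead of the covariance matrix in the critic's RLS update, can be illustrated with a short sketch. The Python code below is a minimal, hypothetical example of a QR-factorization-based RLS update for the quadratic value-function weights of a DLQR critic; the names (`QRRLSCritic`, `quad_basis`) and the surrogate data in the usage section are illustrative assumptions, not the paper's exact algorithm or notation.

```python
import numpy as np

def quad_basis(x):
    """Quadratic basis phi(x) such that V(x) = w^T phi(x) = x^T P x.
    Uses the non-redundant upper-triangular terms of the outer product x x^T."""
    n = len(x)
    outer = np.outer(x, x)
    idx = np.triu_indices(n)
    phi = outer[idx].copy()
    off_diagonal = idx[0] != idx[1]
    phi[off_diagonal] *= 2.0  # off-diagonal terms of P appear twice in x^T P x
    return phi

class QRRLSCritic:
    """Critic weight estimator that keeps the upper-triangular factor R of the
    information matrix and re-triangularizes it with a QR decomposition at each
    step, instead of propagating the covariance matrix as conventional RLS does.
    Illustrative sketch only, not the algorithm published in the paper."""

    def __init__(self, dim, delta=1e-3, lam=1.0):
        self.R = np.sqrt(delta) * np.eye(dim)  # square-root information matrix
        self.z = np.zeros(dim)                 # transformed right-hand side Q^T b
        self.lam = lam                         # forgetting factor

    def update(self, phi, target):
        sqrt_lam = np.sqrt(self.lam)
        # Stack the weighted previous factor with the new regression row and
        # restore triangular form with a single QR decomposition.
        A = np.vstack([sqrt_lam * self.R, phi[None, :]])
        b = np.concatenate([sqrt_lam * self.z, [target]])
        Q, R_new = np.linalg.qr(A)
        self.R = R_new
        self.z = Q.T @ b
        return self.weights()

    def weights(self):
        # Back-substitution on the triangular system R w = z.
        return np.linalg.solve(self.R, self.z)

# Hypothetical usage: each HDP critic step forms a Bellman target from the
# stage cost and the previous value estimate, then refines the weights.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 4
    critic = QRRLSCritic(dim=n * (n + 1) // 2)
    w = critic.weights()
    gamma = 1.0
    for _ in range(50):
        x_k = rng.standard_normal(n)
        x_k1 = 0.9 * x_k + 0.1 * rng.standard_normal(n)  # surrogate closed-loop step
        r_k = x_k @ x_k                                   # surrogate stage cost
        d = r_k + gamma * w @ quad_basis(x_k1)            # Bellman target
        w = critic.update(quad_basis(x_k), d)
```

Re-triangularizing the stacked factor avoids forming the covariance matrix explicitly, which is the usual motivation for square-root (QR-based) RLS variants when numerical stability matters in online use.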
Related Topics
Physical Sciences and Engineering
Computer Science
Artificial Intelligence
Authors
Ernesto F.M. Ferreira, Patrícia H.M. Rêgo, João V.F. Neto