Optimal control of nonlinear discrete time-varying systems using a new neural network approximation structure

Article ID	Journal	Published Year	Pages	File Type
406247	Neurocomputing	2015	9 Pages	PDF

Abstract

Motivated by recently discovered neurocognitive models of mechanisms in the brain, this paper presents a new reinforcement learning (RL) method, based on a novel critic neural network (NN) structure, that solves the optimal tracking problem of a nonlinear discrete time-varying system online. A multiple-model approach combined with an adaptive self-organizing map (ASOM) neural network is used to detect changes in the dynamics of the system. The number of sub-models is determined adaptively and grows whenever a mismatch between the stored sub-models and the new data is detected. Using the ASOM neural network, a novel value function approximation (VFA) scheme is presented. Each sub-model contributes to the overall value function to an extent determined by a responsibility signal obtained from the ASOM. Novel policy iteration and value iteration algorithms are presented to find the optimal control for partially unknown nonlinear discrete time-varying systems online. Simulation results demonstrate the effectiveness of the proposed control scheme.
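
The abstract does not give the exact form of the responsibility signal or of the mismatch test, so the following Python sketch only illustrates the general idea under stated assumptions. It assumes that each sub-model is represented by a prototype vector, that responsibilities are a softmax over negative squared distances to those prototypes, that a new sub-model is added when the nearest prototype is farther away than a fixed threshold, and that each sub-model's value is linear in a hand-picked quadratic feature vector. All function names and parameters (`responsibilities`, `maybe_grow`, `blended_value`, `temperature`, `threshold`) are hypothetical and are not taken from the paper.

```python
import numpy as np


def responsibilities(x, prototypes, temperature=1.0):
    """Soft assignment of input x to each stored sub-model prototype.

    Assumption: softmax over negative squared Euclidean distances.
    """
    d2 = np.sum((prototypes - x) ** 2, axis=1)
    logits = -d2 / temperature
    logits = logits - logits.max()  # shift for numerical stability
    w = np.exp(logits)
    return w / w.sum()


def maybe_grow(x, prototypes, threshold):
    """Add x as a new prototype when no stored prototype is within threshold.

    Assumption: a distance threshold stands in for the mismatch test.
    Returns the (possibly extended) prototype array and a growth flag.
    """
    dists = np.linalg.norm(prototypes - x, axis=1)
    if dists.min() > threshold:
        return np.vstack([prototypes, x]), True
    return prototypes, False


def quadratic_features(x):
    """Hypothetical feature vector for a two-dimensional input."""
    return np.array([x[0] ** 2, x[0] * x[1], x[1] ** 2])


def blended_value(x, prototypes, weights, temperature=1.0):
    """Responsibility-weighted sum of the sub-model value estimates.

    weights has one row of linear critic weights per sub-model.
    """
    r = responsibilities(x, prototypes, temperature)
    sub_values = weights @ quadratic_features(x)  # one value per sub-model
    return float(r @ sub_values)


# Two stored sub-models with illustrative prototypes and critic weights.
prototypes = np.array([[0.0, 0.0], [2.0, 2.0]])
weights = np.array([[1.0, 0.0, 1.0], [2.0, 0.0, 2.0]])

# A point equidistant from both prototypes gets equal responsibilities.
value = blended_value(np.array([1.0, 1.0]), prototypes, weights)

# A point far from every prototype triggers the growth rule.
grown, added = maybe_grow(np.array([5.0, 5.0]), prototypes, threshold=1.0)
```

In this sketch the input `[1.0, 1.0]` is equally far from both prototypes, so the two sub-model estimates are averaged, while the input `[5.0, 5.0]` is outside the threshold of every prototype and becomes a third sub-model.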

Keywords

Multiple-model; Value function approximation; Optimal control; Reinforcement learning