Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
406247 | 678075 | 2015 | 9-page PDF | Free download |
Motivated by recently discovered neurocognitive models of mechanisms in the brain, this paper presents a new reinforcement learning (RL) method, based on a novel critic neural network (NN) structure, that solves the optimal tracking problem for a nonlinear discrete time-varying system online. A multiple-model approach combined with an adaptive self-organizing map (ASOM) neural network is used to detect changes in the system dynamics. The number of sub-models is determined adaptively and grows whenever a mismatch between the stored sub-models and new data is detected. Using the ASOM neural network, a novel value function approximation (VFA) scheme is presented: each sub-model contributes to the overall value function according to a responsibility signal obtained from the ASOM, which determines how much weight that sub-model receives. Novel policy iteration and value iteration algorithms are presented to find the optimal control for partially unknown nonlinear discrete time-varying systems online. Simulation results demonstrate the effectiveness of the proposed control scheme.
Journal: Neurocomputing - Volume 156, 25 May 2015, Pages 157–165
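
The abstract describes a value function assembled from several sub-models, each weighted by a responsibility signal, with new sub-models added when incoming data no longer matches the stored ones. The Python sketch below illustrates that general idea only. It is not the paper's algorithm: the softmax-over-distance responsibility, the `width` parameter, the distance threshold used as the mismatch test, and all function names are assumptions made for illustration, since the abstract does not give the actual formulas.

```python
import math


def responsibilities(x, prototypes, width=1.0):
    """Assumed responsibility signal: a softmax over negative squared
    distances between the state x and each sub-model's prototype vector."""
    d2 = [sum((xi - pi) ** 2 for xi, pi in zip(x, p)) for p in prototypes]
    d2_min = min(d2)  # subtract the minimum for numerical stability
    weights = [math.exp(-(d - d2_min) / (2.0 * width ** 2)) for d in d2]
    total = sum(weights)
    return [w / total for w in weights]


def blended_value(x, prototypes, sub_values, width=1.0):
    """Overall value estimate as the responsibility-weighted sum of the
    per-sub-model value estimates (hypothetical callables in sub_values)."""
    r = responsibilities(x, prototypes, width)
    return sum(ri * v(x) for ri, v in zip(r, sub_values))


def maybe_grow(x, prototypes, threshold):
    """Assumed mismatch test: add a new prototype (in place) when x lies
    farther than `threshold` from every stored prototype."""
    nearest = min(math.dist(x, p) for p in prototypes)
    if nearest > threshold:
        prototypes.append(list(x))
        return True
    return False


if __name__ == "__main__":
    protos = [[0.0, 0.0], [2.0, 0.0]]
    critics = [lambda x: 1.0, lambda x: 5.0]  # placeholder sub-model critics
    print(blended_value([1.0, 0.0], protos, critics))
    print(maybe_grow([9.0, 9.0], protos, threshold=3.0), len(protos))
```

In this sketch, a state equidistant from two prototypes receives equal responsibilities, so the blended value is the plain average of the two sub-model estimates; a state close to one prototype is dominated by that sub-model. A full implementation would also need to create a matching critic whenever a prototype is added, which this sketch omits.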