Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
719649 | IFAC Proceedings Volumes | 2010 | 8 Pages |
Abstract
Given that model-based reinforcement learning (RL) outperforms traditional RL, in this paper we extend traditional model-based RL to a group of self-interested agents that select their actions consecutively while trying to find the optimal policy. Each decision-making situation is modeled as an extensive-form game with perfect information. A modified version of prioritized sweeping is proposed in which the subgame perfect equilibrium is the optimal joint action. Finally, we discuss the algorithm analytically and provide a formal convergence proof.
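As a rough illustration of the subgame perfect equilibrium idea mentioned in the abstract, the following Python sketch applies plain backward induction to a small hand-made game tree with perfect information. It is not the paper's modified prioritized sweeping, and it involves no learning. The `Node` class, the `backward_induction` function, the action labels, and the payoffs are all assumptions invented for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Node:
    """A node in a perfect-information extensive-form game tree (hypothetical helper)."""
    player: Optional[int] = None  # index of the agent to move; None at a leaf
    children: Dict[str, "Node"] = field(default_factory=dict)  # action label -> successor
    payoffs: Optional[Tuple[float, ...]] = None  # one payoff per agent, set on leaves only


def backward_induction(node: Node) -> Tuple[Tuple[float, ...], List[str]]:
    """Return the payoff vector and the on-path actions of a subgame perfect equilibrium."""
    if not node.children:
        # Leaf: nothing left to decide, so the payoffs are final.
        return node.payoffs, []
    best_payoffs, best_path = None, None
    for action, child in node.children.items():
        # Solve the subgame below this action first.
        payoffs, path = backward_induction(child)
        # The agent moving here keeps the action that maximizes its own payoff.
        # Ties are broken in favor of the first action encountered.
        if best_payoffs is None or payoffs[node.player] > best_payoffs[node.player]:
            best_payoffs, best_path = payoffs, [action] + path
    return best_payoffs, best_path


# Assumed two-agent game: agent 0 moves first, agent 1 observes the move and responds.
game = Node(player=0, children={
    "a": Node(player=1, children={
        "x": Node(payoffs=(3.0, 1.0)),
        "y": Node(payoffs=(0.0, 2.0)),
    }),
    "b": Node(player=1, children={
        "x": Node(payoffs=(2.0, 2.0)),
        "y": Node(payoffs=(1.0, 0.0)),
    }),
})

equilibrium_payoffs, equilibrium_path = backward_induction(game)
print(equilibrium_path, equilibrium_payoffs)
```

In this toy game agent 0 would prefer the outcome `(3.0, 1.0)` under action `a`, but agent 1 would answer `a` with `y`, so agent 0 chooses `b` instead. This is the sense in which consecutive action selection by self-interested agents leads to an equilibrium joint action rather than to each agent's individually preferred outcome.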