Model-based Reinforcement Learning in Multiagent Systems in Extensive Forms with Perfect Information

Article ID	Journal	Published Year	Pages	File Type
719649	IFAC Proceedings Volumes	2010	8 Pages	PDF

Abstract

Given that model-based reinforcement learning outperforms traditional RL, this paper extends traditional model-based reinforcement learning to a group of self-interested agents that select their actions consecutively while seeking an optimal policy. Each decision-making situation is modeled as an extensive-form game with perfect information. We propose a modified version of prioritized sweeping in which the subgame perfect equilibrium is taken as the optimal joint action. Finally, we analyze the algorithm and provide a formal convergence proof.
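
To make the notion of a subgame perfect equilibrium joint action concrete, the following is a minimal sketch of backward induction on a small extensive-form game with perfect information. It is not the paper's modified prioritized sweeping algorithm, which the abstract does not spell out; the `Leaf` and `Node` types, the `backward_induction` function, and the example payoffs are all hypothetical, chosen only for illustration. In a learning setting, the fixed leaf payoffs would presumably be replaced by the agents' current value estimates, but that is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple, Union


@dataclass
class Leaf:
    """Terminal outcome: one payoff per agent (hypothetical illustration type)."""
    payoffs: Tuple[float, ...]


@dataclass
class Node:
    """Decision point at which exactly one agent moves, seeing all earlier moves."""
    player: int
    children: Dict[str, Union["Node", Leaf]]


def backward_induction(node: Union[Node, Leaf]) -> Tuple[Tuple[float, ...], List[str]]:
    """Return (equilibrium payoffs, equilibrium action sequence) for the subgame at node."""
    if isinstance(node, Leaf):
        return node.payoffs, []
    best_action, best_payoffs, best_path = None, None, None
    for action, child in node.children.items():
        # Solve the subgame first, so that play is optimal in every subgame.
        payoffs, path = backward_induction(child)
        # The moving agent compares only its own payoff; ties keep the first action seen.
        if best_payoffs is None or payoffs[node.player] > best_payoffs[node.player]:
            best_action, best_payoffs, best_path = action, payoffs, path
    return best_payoffs, [best_action] + best_path


# Assumed example: agent 0 moves first (L or R), then agent 1 moves (l or r).
game = Node(player=0, children={
    "L": Node(player=1, children={"l": Leaf((3, 1)), "r": Leaf((0, 0))}),
    "R": Node(player=1, children={"l": Leaf((2, 2)), "r": Leaf((1, 3))}),
})

payoffs, joint_action = backward_induction(game)
# payoffs == (3, 1), joint_action == ["L", "l"]
```

Here agent 1 would answer L with l and R with r, so agent 0 anticipates payoffs of 3 and 1 respectively and chooses L. The resulting action sequence is the joint action played along the equilibrium path.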