Improved bound on the worst case complexity of Policy Iteration

Article ID	Journal	Published Year	Pages	File Type
1142229	Operations Research Letters	2016	6 Pages	PDF

Abstract

Solving Markov Decision Processes is a recurrent task in engineering that can be performed efficiently in practice using the Policy Iteration algorithm. Regarding its complexity, both the lower and upper bounds are known to be exponential in the size of the problem, but they remain far apart. In this work, we provide the first improvement over the now-standard upper bound from Mansour and Singh (1999). We also show that this bound is tight for a natural relaxation of the problem.

Keywords

Policy iteration; Markov decision process; Complexity

Related Topics

Physical Sciences and Engineering > Mathematics > Discrete Mathematics and Combinatorics

Authors

Romain Hollanders, Balázs Gerencsér, Jean-Charles Delvenne, Raphaël M. Jungers