Article ID | Journal | Published Year | Pages |
---|---|---|---|
1142980 | Operations Research Letters | 2009 | 4 Pages |
Abstract
In the context of finite weakly communicating Markov Decision Processes, we tackle the problem of fast convergence of state-action frequency vectors to the polytope of stationary distributions on state-action frequencies. Using unichain policies, we derive bounds on the speed of convergence which are independent of the limit points.
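To make the objects named in the abstract concrete, here is a minimal sketch using only the Python standard library. It is not the paper's method. It assumes the textbook description of the state-action polytope, namely the nonnegative vectors x summing to 1 that satisfy the flow-balance constraints sum_a x(s', a) = sum_{s, a} p(s' | s, a) x(s, a), and it measures how far an empirical frequency vector is from satisfying them. The function names, the toy two-state MDP, and the choice of residual as a distance proxy are all illustrative assumptions.

```python
def empirical_frequencies(trajectory):
    """Fraction of time steps spent on each (state, action) pair."""
    counts = {}
    for pair in trajectory:
        counts[pair] = counts.get(pair, 0) + 1
    return {pair: c / len(trajectory) for pair, c in counts.items()}


def balance_residual(x, P, states):
    """Largest violation of the flow-balance constraints over all states.

    x maps (state, action) -> frequency.
    P maps (state, action) -> {next_state: probability}.
    """
    worst = 0.0
    for t in states:
        outflow = sum(v for (s, a), v in x.items() if s == t)
        inflow = sum(v * P[(s, a)].get(t, 0.0) for (s, a), v in x.items())
        worst = max(worst, abs(outflow - inflow))
    return worst


# Hypothetical toy MDP: two states, one action, deterministic cycle 0 -> 1 -> 0.
P = {(0, 0): {1: 1.0}, (1, 0): {0: 1.0}}
STATES = [0, 1]


def residual_after(T):
    """Residual of the empirical frequencies after T steps starting in state 0."""
    trajectory = [(t % 2, 0) for t in range(T)]
    return balance_residual(empirical_frequencies(trajectory), P, STATES)


if __name__ == "__main__":
    for T in (5, 51, 501):
        print(T, residual_after(T))
```

In this toy cycle the residual equals 1/T for odd T and vanishes for even T, so the empirical frequencies approach the polytope at rate 1/T. The bounds derived in the paper concern general weakly communicating MDPs and are not reproduced here.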
Related Topics
Physical Sciences and Engineering
Mathematics
Discrete Mathematics and Combinatorics
Authors
Mathieu Tracol