Article ID Journal Published Year Pages File Type
1142980 Operations Research Letters 2009 4 Pages PDF
Abstract

In the context of finite weakly communicating Markov Decision Processes, we tackle the problem of fast convergence of state-action frequency vectors to the polytope of stationary distributions on state-action frequencies. Using unichain policies, we derive bounds on the speed of convergence which are independent of the limit points.

Related Topics
Physical Sciences and Engineering Mathematics Discrete Mathematics and Combinatorics
Authors
,