Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
697016 | Automatica | 2009 | 11 Pages | |
Abstract
In this paper, we study the nth-bias optimality problem for finite continuous-time Markov decision processes (MDPs) with a multichain structure. We first provide nth-bias difference formulas for two policies and present some interesting characterizations of an nth-bias optimal policy by using these difference formulas. Then, we prove the existence of an nth-bias optimal policy by using nth-bias optimal policy iteration algorithms, and show that such an nth-bias optimal policy can be obtained in a finite number of policy iterations.
Related Topics
Physical Sciences and Engineering
Engineering
Control and Systems Engineering
Authors
Junyu Zhang, Xi-Ren Cao