Article ID | Journal | Published Year | Pages
---|---|---|---
695396 | Automatica | 2015 | 4
Abstract
This communique first presents a novel multi-policy improvement method that generates a feasible policy at least as good as any policy in a given set of feasible policies in finite constrained Markov decision processes (CMDPs). A random search algorithm for finding an optimal feasible policy for a given CMDP is derived by suitably adapting the improvement method. The algorithm alleviates the major drawback of the existing value-iteration- and policy-iteration-type exact algorithms, which must solve an unconstrained MDP at each iteration. We establish that the sequence of feasible policies generated by the algorithm converges to an optimal feasible policy with probability one and has a probabilistic exponential convergence rate.
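To make the multi-policy improvement idea concrete, the following is a minimal sketch of the classical unconstrained analogue, which the communique extends to CMDPs: given the value functions of several policies, the "switching" policy that at each state follows the policy whose value is largest there is at least as good as every policy in the set, pointwise. The 3-state, 2-action MDP below is a hypothetical example, not from the paper, and constraints are omitted.

```python
# Hypothetical 3-state, 2-action MDP (illustrative only; the paper's
# method additionally maintains feasibility under cost constraints).
GAMMA = 0.9
S = [0, 1, 2]
# P[s][a] = list of (next_state, prob); R[s][a] = immediate reward.
P = {
    0: {0: [(1, 1.0)], 1: [(2, 1.0)]},
    1: {0: [(0, 1.0)], 1: [(2, 1.0)]},
    2: {0: [(2, 1.0)], 1: [(0, 1.0)]},
}
R = {
    0: {0: 1.0, 1: 0.0},
    1: {0: 0.0, 1: 10.0},
    2: {0: 0.5, 1: 0.0},
}

def evaluate(policy, iters=2000):
    """Iterative policy evaluation: V <- R_pi + GAMMA * P_pi V."""
    V = [0.0] * len(S)
    for _ in range(iters):
        V = [R[s][policy[s]]
             + GAMMA * sum(p * V[s2] for s2, p in P[s][policy[s]])
             for s in S]
    return V

pi1 = [0, 0, 0]          # two base policies in the given set
pi2 = [1, 1, 1]
V1, V2 = evaluate(pi1), evaluate(pi2)

# Switching policy: at each state, act as the base policy with larger value.
pi_star = [pi1[s] if V1[s] >= V2[s] else pi2[s] for s in S]
V_star = evaluate(pi_star)

# The combined policy dominates both base policies pointwise.
assert all(V_star[s] >= max(V1[s], V2[s]) - 1e-9 for s in S)
print(pi_star)  # mixes actions from both base policies
```

The dominance assertion follows from a one-step argument: the switching policy's Bellman operator applied to max(V1, V2) is no smaller than max(V1, V2), and monotonicity of the operator carries this through to the fixed point. The paper's contribution is performing this kind of improvement while preserving feasibility with respect to the CMDP's constraints.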
Related Topics
Physical Sciences and Engineering
Engineering
Control and Systems Engineering
Authors
Hyeong Soo Chang