Article ID | Journal | Published Year | Pages | File Type
---|---|---|---|---
9503149 | Journal of Mathematical Analysis and Applications | 2005 | 10 Pages |
Abstract
Iwamoto recently established a formal transformation, via invariant imbedding, that constructs a controlled Markov chain solvable in a backward manner, as in backward induction for finite-horizon Markov decision processes (MDPs), from a given controlled Markov chain with a non-additive forward recursive objective function criterion. Chang et al. presented formal methods, called "parallel rollout" and "policy switching," for combining multiple given policies in MDPs, and showed that the policies generated by both methods improve upon all of the policies they combine. This brief paper extends parallel rollout and policy switching to forward recursive objective function criteria and shows that a similar improvement property holds as in MDPs. We further discuss how to implement these methods via simulation.
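To make the policy-switching idea concrete, the following is a minimal sketch for the standard additive finite-horizon criterion (not the paper's forward recursive criterion). The toy MDP, its data, and all names are illustrative assumptions, not taken from the paper: each base policy is evaluated by backward induction, and the switching policy acts, at each state and stage, as the base policy with the higher value-to-go.

```python
import numpy as np

# Hypothetical toy MDP: 3 states, 2 actions, horizon H, additive reward.
S, A, H = 3, 2, 5
rng = np.random.default_rng(0)
P = rng.dirichlet(np.ones(S), size=(S, A))  # P[s, a] = next-state distribution
R = rng.uniform(0.0, 1.0, size=(S, A))      # R[s, a] = one-step reward

def evaluate(policy):
    """Backward induction for a deterministic (possibly time-varying) policy.
    policy(t, s) -> action; returns V with V[t, s] = value-to-go from (t, s)."""
    V = np.zeros((H + 1, S))
    for t in range(H - 1, -1, -1):
        for s in range(S):
            a = policy(t, s)
            V[t, s] = R[s, a] + P[s, a] @ V[t + 1]
    return V

# Two stationary base policies (arbitrary choices for illustration).
pi1 = lambda t, s: 0
pi2 = lambda t, s: s % 2
V1, V2 = evaluate(pi1), evaluate(pi2)

# Policy switching: at (t, s), follow the base policy with higher value-to-go.
def pi_switch(t, s):
    return pi1(t, s) if V1[t, s] >= V2[t, s] else pi2(t, s)

V_ps = evaluate(pi_switch)
# The switching policy improves on every base policy at every (t, s).
assert np.all(V_ps >= np.maximum(V1, V2) - 1e-12)
```

The improvement property follows by backward induction on `t`: at each stage the switching policy takes the action of the locally better base policy and then, inductively, does at least as well as that policy thereafter. The paper's contribution is to establish the analogous property when the additive sum is replaced by a forward recursive criterion.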
Related Topics
Physical Sciences and Engineering › Mathematics › Analysis
Authors
Hyeong Soo Chang