Article ID: 5774521
Journal: Journal of Mathematical Analysis and Applications
Published Year: 2017
Pages: 8
File Type: PDF
Abstract
We prove that a semi-Markov decision process with finite state and action spaces and limiting ratio average (undiscounted) payoff has an optimal pure semi-stationary policy (i.e., a semi-Markov policy that does not depend on the decision epoch count). We conclude by showing, with the aid of an example, that the result cannot be strengthened further. A crude but finite-step algorithm is given to compute such an optimal policy.
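To make the objects in the abstract concrete, the following is a minimal Python sketch of a brute-force search over pure stationary policies, one per initial state. It is not the algorithm from the paper. It assumes that a pure semi-stationary policy can be represented as a choice of pure stationary policy for each initial state, and it approximates the limiting ratio average payoff by the ratio of accumulated expected rewards to accumulated expected sojourn times over a finite horizon. The names `ratio_average`, `best_semi_stationary`, `P`, `r`, `tau` and `horizon`, as well as the toy data, are hypothetical.

```python
from itertools import product


def ratio_average(P, r, tau, f, s0, horizon=2000):
    """Finite-horizon approximation (an assumption of this sketch) of the
    ratio average payoff of pure stationary policy f from initial state s0:
    (sum of expected rewards) / (sum of expected sojourn times).

    P[s][a][t] : transition probability from s to t under action a
    r[s][a]    : expected immediate reward
    tau[s][a]  : expected sojourn time, assumed strictly positive
    f[s]       : action chosen in state s
    """
    n = len(f)
    dist = [0.0] * n          # distribution of the current state
    dist[s0] = 1.0
    total_r = 0.0
    total_t = 0.0
    for _ in range(horizon):
        total_r += sum(dist[s] * r[s][f[s]] for s in range(n))
        total_t += sum(dist[s] * tau[s][f[s]] for s in range(n))
        # advance the state distribution by one decision epoch
        dist = [sum(dist[s] * P[s][f[s]][t] for s in range(n))
                for t in range(n)]
    return total_r / total_t


def best_semi_stationary(P, r, tau, horizon=2000):
    """Enumerate every pure stationary policy and keep, for each initial
    state, one that maximizes the approximate ratio average payoff."""
    n = len(P)
    policies = list(product(*[range(len(P[s])) for s in range(n)]))
    return {
        s0: max(policies,
                key=lambda f: ratio_average(P, r, tau, f, s0, horizon))
        for s0 in range(n)
    }


# Hypothetical toy model: state 0 can stay (action 0) or move to the
# absorbing state 1 (action 1); state 1 has two identical actions.
P = [[[1.0, 0.0], [0.0, 1.0]],
     [[0.0, 1.0], [0.0, 1.0]]]
r = [[1.0, 0.0], [3.0, 3.0]]
tau = [[1.0, 1.0], [2.0, 2.0]]

policy_by_initial_state = best_semi_stationary(P, r, tau)
```

Here `policy_by_initial_state[s0]` is a tuple giving the action to take in each current state when the process started in `s0`. The enumeration terminates after finitely many steps because there are only finitely many pure stationary policies, but its cost grows exponentially with the number of states, and the finite-horizon ratio is only an approximation of the limit.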
Related Topics
Physical Sciences and Engineering > Mathematics > Analysis