Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
5774521 | Journal of Mathematical Analysis and Applications | 2017 | 8 Pages |
Abstract
We prove that a finite semi-Markov decision process (finite state and action spaces) with limiting ratio average (undiscounted) payoff has an optimal pure semi-stationary policy, i.e., a semi-Markov policy that is independent of the decision epoch count. We conclude by showing, with the aid of an example, that this result cannot be strengthened further. A crude but finite-step algorithm is given to compute such an optimal policy.
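For orientation, the limiting ratio average payoff is usually written in the following form. This is a sketch of the standard definition, not a formula quoted from the paper; the symbols $\phi$, $r$, $\bar{\tau}$, $X_m$ and $A_m$ are assumed notation.

```latex
% Assumed notation (not from the abstract):
%   r(s,a)          immediate reward in state s under action a
%   \bar{\tau}(s,a) expected sojourn (transition) time, assumed strictly positive
%   (X_m, A_m)      state and action at the m-th decision epoch
%   \pi             a policy, s the initial state
\phi(s, \pi) \;=\; \liminf_{n \to \infty}
  \frac{\mathbb{E}_{\pi}\!\left[\,\sum_{m=1}^{n} r(X_m, A_m) \,\middle|\, X_1 = s\right]}
       {\mathbb{E}_{\pi}\!\left[\,\sum_{m=1}^{n} \bar{\tau}(X_m, A_m) \,\middle|\, X_1 = s\right]}
```

Under this reading, a policy $\pi^{*}$ is optimal when $\phi(s, \pi^{*}) \ge \phi(s, \pi)$ for every initial state $s$ and every policy $\pi$. A pure semi-stationary policy would then choose a single action as a function of the initial state and the current state only, without reference to how many decision epochs have elapsed.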
Related Topics
Physical Sciences and Engineering
Mathematics
Analysis
Authors
Sagnik Sinha, Prasenjit Mondal