Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
4618422 | Journal of Mathematical Analysis and Applications | 2011 | 13 Pages | |
Abstract
In this paper, we study discounted Markov decision processes on an uncountable state space. We allow the utility (reward) function to be unbounded both from above and from below. A new feature of our approach is an easily verifiable rate-of-growth condition imposed on the positive part of the utility function. This assumption, in turn, enables us to prove that the value iteration algorithm converges to a solution of the Bellman equation. Moreover, by virtue of the optimality equation, we show the existence of an optimal stationary policy.
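To make the value iteration idea in the abstract concrete, here is a minimal sketch on a toy problem. It assumes a finite model with two states and two actions, bounded rewards, and a discount factor of 0.9, so it does not reproduce the paper's setting of an uncountable state space with unbounded utility. The names `bellman_update` and `value_iteration`, along with all numbers, are hypothetical and chosen only for illustration.

```python
def bellman_update(v, P, r, beta):
    """Apply the Bellman operator once; return (new values, greedy actions).

    P[a][s][t] is the probability of moving from state s to state t under
    action a, r[s][a] is the one-step reward, beta is the discount factor.
    """
    n = len(v)
    new_v, greedy = [], []
    for s in range(n):
        # q[a] = r(s, a) + beta * E[v(next state) | s, a]
        q = [r[s][a] + beta * sum(P[a][s][t] * v[t] for t in range(n))
             for a in range(len(r[s]))]
        best = max(range(len(q)), key=q.__getitem__)
        new_v.append(q[best])
        greedy.append(best)
    return new_v, greedy


def value_iteration(P, r, beta, tol=1e-10, max_iter=10_000):
    """Iterate the Bellman operator from v = 0 until successive iterates agree."""
    v = [0.0] * len(r)
    for _ in range(max_iter):
        new_v, _ = bellman_update(v, P, r, beta)
        gap = max(abs(x - y) for x, y in zip(new_v, v))
        v = new_v
        if gap < tol:
            break
    # A stationary policy that is greedy with respect to the final values.
    _, policy = bellman_update(v, P, r, beta)
    return v, policy


# Toy data (assumed, not from the paper): rewards take both signs.
P = [
    [[0.9, 0.1], [0.4, 0.6]],  # transition matrix under action 0
    [[0.2, 0.8], [0.1, 0.9]],  # transition matrix under action 1
]
r = [[1.0, 0.0], [-2.0, 3.0]]  # r[s][a]

values, policy = value_iteration(P, r, beta=0.9)
print(values, policy)
```

In this finite, bounded case the Bellman operator is a contraction in the supremum norm, so convergence follows from the Banach fixed-point theorem. With unbounded utility that argument is not directly available, which is why the abstract relies on a growth condition instead.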
Related Topics
Physical Sciences and Engineering > Mathematics > Analysis