Article ID: 4618422
Journal: Journal of Mathematical Analysis and Applications
Published Year: 2011
Pages: 13 Pages
File Type: PDF
Abstract

In this paper, we study discounted Markov decision processes on an uncountable state space. We allow the utility (reward) function to be unbounded both from above and from below. A new feature of our approach is an easily verifiable rate-of-growth condition imposed on the positive part of the utility function. This assumption, in turn, enables us to prove the convergence of a value iteration algorithm to a solution of the Bellman equation. Moreover, by virtue of the optimality equation, we show the existence of an optimal stationary policy.
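
To make the value iteration idea concrete, here is a minimal Python sketch on a small finite model. It is only an illustration, not the construction used in the paper: the paper works on an uncountable state space with an unbounded utility, whereas this sketch assumes finitely many states and actions and bounded rewards. The function name `value_iteration`, its parameters, and the two-state example are all hypothetical.

```python
import numpy as np


def value_iteration(rewards, transitions, beta, tol=1e-10, max_iter=10_000):
    """Iterate the Bellman operator on a finite MDP (illustrative assumption).

    rewards:     array of shape (S, A), rewards[s, a] = u(s, a)
    transitions: array of shape (A, S, S), transitions[a, s, t] = P(t | s, a)
    beta:        discount factor in (0, 1)
    """
    n_states, _ = rewards.shape
    v = np.zeros(n_states)
    for _ in range(max_iter):
        # q[s, a] = u(s, a) + beta * sum_t P(t | s, a) * v[t]
        q = rewards + beta * np.einsum("ast,t->sa", transitions, v)
        v_new = q.max(axis=1)
        converged = np.max(np.abs(v_new - v)) < tol
        v = v_new
        if converged:
            break
    # A stationary policy that is greedy with respect to the final iterate.
    q = rewards + beta * np.einsum("ast,t->sa", transitions, v)
    return v, q.argmax(axis=1)


# Hypothetical two-state example with rewards of both signs.
rewards = np.array([[0.0, -1.0],
                    [2.0, 1.0]])
transitions = np.array([
    [[1.0, 0.0], [0.0, 1.0]],  # action 0: stay in the current state
    [[0.0, 1.0], [0.0, 1.0]],  # action 1: move to (or stay in) state 1
])
v, policy = value_iteration(rewards, transitions, beta=0.9)
```

In this toy model the iterates approach the fixed point of the Bellman equation, v(1) = 2 / (1 - 0.9) = 20 and v(0) = -1 + 0.9 * 20 = 17, and the greedy stationary policy pays the one-off cost in state 0 to reach state 1. Convergence here follows from the standard contraction argument for bounded rewards; the unbounded case treated in the paper needs the growth condition described in the abstract.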
