A note on deterministic approximation of discounted Markov decision processes

Article ID	Journal	Published Year	Pages	File Type
1709036	Applied Mathematics Letters	2009	5 Pages	PDF

Abstract

We study the approximation of a small-noise Markov decision process $x_t = F(x_{t-1}, a_t, \xi_t(\epsilon))$, $t = 1, 2, \ldots$, by means of its deterministic counterpart $\tilde{x}_t = F(\tilde{x}_{t-1}, a_t, s_0)$, $t = 1, 2, \ldots$, where $s_0$ is a fixed point of the disturbance metric space $(S, r)$. The total discounted cost is used as the criterion of optimality. Supposing that $\delta_\epsilon := E\, r(\xi_1(\epsilon), s_0) \to 0$ as $\epsilon \to 0$, we prove the convergence of optimal policies, estimate the rate of convergence of the optimal costs, and give an upper bound (depending on $\delta_\epsilon$) for the stability index, which measures the excess cost incurred when the optimal policy is replaced by its deterministic approximation.
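The relationship between the two processes can be illustrated numerically. The sketch below is not taken from the paper: the scalar dynamics `F`, the one-step `cost`, the fixed linear `policy`, the discount factor `alpha`, and the Gaussian disturbance with $s_0 = 0$ and $r(s, s') = |s - s'|$ are all hypothetical choices made for illustration. Under these assumptions $\delta_\epsilon = \epsilon\sqrt{2/\pi}$. The code compares the discounted cost of one fixed policy on the noisy and the deterministic process; it does not compute optimal policies or the stability index.

```python
import random

def F(x, a, s):
    # Hypothetical dynamics: x_t = F(x_{t-1}, a_t, s).
    return 0.5 * x + a + s

def cost(x, a):
    # Hypothetical one-step cost, chosen only for illustration.
    return abs(x) + abs(a)

def policy(x):
    # Hypothetical fixed stationary policy (not claimed to be optimal).
    return -0.25 * x

def discounted_cost(x0, noise, alpha=0.9, horizon=100):
    # Truncated total discounted cost along one trajectory.
    # Assumed convention: action a_t is chosen at state x_{t-1}.
    x, total = x0, 0.0
    for t in range(1, horizon + 1):
        a = policy(x)
        total += alpha ** (t - 1) * cost(x, a)
        x = F(x, a, noise())
    return total

def compare(eps, x0=1.0, runs=500, seed=0):
    # Returns (deterministic cost, Monte Carlo estimate of the noisy cost).
    rng = random.Random(seed)
    s0 = 0.0  # fixed point of the disturbance space (assumed)
    det = discounted_cost(x0, lambda: s0)
    sto = sum(
        discounted_cost(x0, lambda: rng.gauss(s0, eps)) for _ in range(runs)
    ) / runs
    return det, sto

if __name__ == "__main__":
    for eps in (0.5, 0.1, 0.01):
        det, sto = compare(eps)
        print(f"eps={eps}: deterministic={det:.4f}, noisy={sto:.4f}, gap={sto - det:.4f}")
```

In this toy setting the gap between the noisy and deterministic costs shrinks as `eps` decreases, which is the qualitative behaviour the abstract describes for the optimal costs as $\delta_\epsilon \to 0$.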

Keywords

Deterministic approximation; Markov decision process; Rate of convergence