The average cost of Markov chains subject to total variation distance uncertainty

Article ID	Journal	Published Year	Pages	File Type
10127504	Systems & Control Letters	2018	7 Pages	PDF

Abstract

This paper addresses the problem of controlling a Markov chain so as to minimize the long-run expected average cost per unit time when the invariant distribution is unknown but is known to belong to a given uncertainty set. This set is described mathematically by total variation distance uncertainty. We show that the equilibrium control policy, which assigns higher probability to the low-cost states and lower probability to the high-cost states, is an optimal control policy that minimizes the average cost. Recognizing such a policy may be of value in practical situations with constraints consistent with those studied here, when the invariant distribution is uncertain and an optimal control policy must be derived online.
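
To illustrate the idea of shifting probability toward low-cost states, the following is a minimal sketch and not the paper's algorithm. It covers only a static, finite-state version of the problem: given a nominal distribution and a cost vector, it finds the distribution within a total variation ball that minimizes the expected cost. It assumes the total variation distance is measured as the sum of absolute differences (so the radius lies in [0, 2]); the function name `tv_min_cost_distribution` and the numbers used are hypothetical.

```python
import numpy as np

def tv_min_cost_distribution(mu, cost, radius):
    """Minimize cost @ nu over distributions nu with sum(|nu - mu|) <= radius.

    Assumption: total variation distance is taken as the L1 distance,
    so at most radius / 2 of probability mass can be relocated.
    """
    mu = np.asarray(mu, dtype=float)
    cost = np.asarray(cost, dtype=float)
    nu = mu.copy()

    # Add mass to the cheapest state: at most radius / 2, and no more
    # than the total mass currently sitting on the other states.
    cheapest = int(np.argmin(cost))
    budget = min(radius / 2.0, 1.0 - mu[cheapest])
    nu[cheapest] += budget

    # Remove the same amount of mass, starting from the most expensive states.
    remaining = budget
    for i in np.argsort(-cost, kind="stable"):
        if remaining <= 0.0:
            break
        if i == cheapest:
            continue
        take = min(nu[i], remaining)
        nu[i] -= take
        remaining -= take
    return nu

# Hypothetical example: uniform nominal distribution over four states.
mu = np.array([0.25, 0.25, 0.25, 0.25])
cost = np.array([1.0, 2.0, 3.0, 4.0])
nu = tv_min_cost_distribution(mu, cost, radius=0.5)
print(nu, cost @ mu, cost @ nu)
```

In this example a mass of 0.25 moves from the most expensive state to the cheapest one, lowering the expected cost from 2.5 to 1.75. The controlled Markov chain setting of the paper involves more than this static step, so the sketch should be read only as an illustration of how the minimizing distribution is shaped.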

Keywords

Total variation distance; Average cost; Stochastic optimal control; Controlled Markov chain