Article ID: 10127504
Journal: Systems & Control Letters
Published Year: 2018
Pages: 7
File Type: PDF
Abstract
This paper addresses the problem of controlling a Markov chain so as to minimize the long-run expected average cost per unit time when the invariant distribution is unknown but is known to belong to a given uncertainty set. The uncertainty set is described using the total variation distance. We show that the equilibrium control policy, which assigns higher probability to low-cost states and lower probability to high-cost states, is an optimal control policy that minimizes the average cost. Recognizing such a policy may be of value in practical situations whose constraints are consistent with those studied here, where the invariant distribution is uncertain and an optimal control policy must be derived online.
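The mass-shifting pattern described in the abstract can be illustrated with a small static example. The sketch below is not the paper's algorithm and involves no Markov chain dynamics or control. It only minimizes an expected cost over distributions within a total variation ball around a nominal distribution. The function name `min_cost_in_tv_ball`, the nominal distribution, the cost vector, and the convention that total variation distance is the sum of absolute differences (so the radius lies between 0 and 2) are all assumptions made for illustration.

```python
def min_cost_in_tv_ball(nominal, cost, radius):
    """Minimize sum(nu[i] * cost[i]) over probability vectors nu
    satisfying sum(|nu[i] - nominal[i]|) <= radius.

    Assumption: total variation distance is taken as the L1 distance,
    so moving mass m from one state to another uses 2 * m of the radius.
    """
    nu = list(nominal)
    # Index of a lowest-cost state; it receives all the moved mass.
    low = min(range(len(cost)), key=lambda i: cost[i])
    # Mass that may be moved: half the radius, capped by what the
    # other states actually hold.
    remaining = min(radius / 2.0, 1.0 - nu[low])
    # Remove mass from the most expensive states first.
    for i in sorted(range(len(cost)), key=lambda i: cost[i], reverse=True):
        if remaining <= 0.0:
            break
        if i == low:
            continue
        moved = min(nu[i], remaining)
        nu[i] -= moved
        nu[low] += moved
        remaining -= moved
    expected_cost = sum(p * c for p, c in zip(nu, cost))
    return nu, expected_cost


# Hypothetical three-state example.
nominal = [0.25, 0.25, 0.5]
cost = [1.0, 2.0, 4.0]
nu, value = min_cost_in_tv_ball(nominal, cost, radius=0.5)
print(nu, value)
```

In this example the minimizing distribution takes probability away from the highest-cost state and gives it to the lowest-cost state, which is the same qualitative behavior the abstract attributes to the equilibrium control policy.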
Related Topics
Physical Sciences and Engineering > Engineering > Control and Systems Engineering