Article ID: 4618494
Journal: Journal of Mathematical Analysis and Applications
Published Year: 2011
Pages: 9
File Type: PDF
Abstract

This paper is concerned with the adaptive control problem, over the infinite horizon, for partially observable Markov decision processes whose transition functions are parameterized by an unknown vector. We treat finite models and impose relatively mild assumptions on the transition function. Provided that a sequence of parameter estimates converging in probability to the true parameter value is available, we show that the certainty equivalence adaptive policy is optimal in the long-run average sense.
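As an illustration only, one common reading of a certainty equivalence scheme for a finite partially observable model can be sketched in Python: at each step the controller treats the current estimate as if it were the true parameter, picks the action that a policy designed for that parameter value would pick, and updates its belief over the hidden state with the transition matrix induced by the same estimate. Every name here (`belief_update`, `certainty_equivalence_step`, `policy_for`, `trans_of`, `obs`, `theta_hat`) is hypothetical and not taken from the paper, and the belief filter as the information state is an assumption; the paper's own construction and conditions may differ.

```python
from typing import Callable, List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def belief_update(belief: Vector, trans: Matrix, obs: Matrix, y: int) -> Vector:
    """Bayes filter step: predict with trans[x][x_next], correct with obs[x_next][y]."""
    n = len(belief)
    predicted = [sum(belief[x] * trans[x][x2] for x in range(n)) for x2 in range(n)]
    unnormalized = [predicted[x2] * obs[x2][y] for x2 in range(n)]
    total = sum(unnormalized)
    if total == 0.0:
        # The observation has zero probability under this model; keep the prediction.
        return predicted
    return [u / total for u in unnormalized]


def certainty_equivalence_step(
    belief: Vector,
    theta_hat: float,
    y: int,
    policy_for: Callable[[float], Callable[[Vector], int]],
    trans_of: Callable[[float, int], Matrix],
    obs: Matrix,
) -> Tuple[int, Vector]:
    """Act as if theta_hat were the true parameter, then filter with the same model.

    policy_for(theta) is assumed (hypothetically) to return a policy that is
    optimal for the model with parameter theta; computing it is not shown here.
    y is the observation received after the chosen action is applied.
    """
    action = policy_for(theta_hat)(belief)
    next_belief = belief_update(belief, trans_of(theta_hat, action), obs, y)
    return action, next_belief
```

In use, `theta_hat` would be replaced at each step by the latest element of the estimate sequence, so that the policy and the filter both track the estimates as they converge.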

Related Topics
Physical Sciences and Engineering > Mathematics > Analysis