Article ID | Journal | Published Year | Pages |
---|---|---|---|
695511 | Automatica | 2014 | 5 Pages |
Abstract
This paper focuses on the design of time-homogeneous, fully observed Markov decision processes (MDPs) with finite state and action spaces. The main objective is to obtain policies that generate the maximal set of recurrent states, subject to convex constraints on the set of invariant probability mass functions. We propose a design method that relies on a finitely parametrized convex program inspired by principles of entropy maximization. A numerical example is provided to illustrate these ideas.
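To make the terms "invariant probability mass function" and "recurrent states" concrete, the following minimal sketch may help. It is not the paper's convex program: it only takes a fixed randomized policy, forms the resulting closed-loop Markov chain, and approximates an invariant pmf whose support is the set of recurrent states. The three-state kernel `P`, the policies `K_mix` and `K_absorb`, and all function names are hypothetical and chosen purely for illustration.

```python
# Illustrative sketch only; the kernel, policies, and names are assumptions.

def closed_loop(P, K):
    """Transition matrix T[x][y] = sum_u K[x][u] * P[x][u][y] under policy K."""
    n = len(P)
    return [
        [sum(K[x][u] * P[x][u][y] for u in range(len(K[x]))) for y in range(n)]
        for x in range(n)
    ]

def invariant_pmf(T, iters=2000):
    """Approximate an invariant pmf by iterating the lazy chain (I + T) / 2.

    The lazy chain has the same invariant pmfs as T but is aperiodic, so the
    iteration converges. Starting from the uniform pmf places mass on every
    recurrent class, so the limit is supported on all recurrent states.
    """
    n = len(T)
    p = [1.0 / n] * n
    for _ in range(iters):
        step = [sum(p[x] * T[x][y] for x in range(n)) for y in range(n)]
        p = [0.5 * p[y] + 0.5 * step[y] for y in range(n)]
    return p

def support(p, tol=1e-9):
    """States carrying non-negligible invariant mass."""
    return {x for x, mass in enumerate(p) if mass > tol}

# Hypothetical MDP: action 0 stays put, action 1 moves to the next state on a ring.
N = 3
P = [
    [
        [1.0 if y == x else 0.0 for y in range(N)],            # action 0: stay
        [1.0 if y == (x + 1) % N else 0.0 for y in range(N)],  # action 1: move
    ]
    for x in range(N)
]

K_mix = [[0.5, 0.5] for _ in range(N)]            # randomize in every state
K_absorb = [[0.0, 1.0], [0.0, 1.0], [1.0, 0.0]]   # move until state 2, then stay

recurrent_mix = support(invariant_pmf(closed_loop(P, K_mix)))
recurrent_absorb = support(invariant_pmf(closed_loop(P, K_absorb)))
print(sorted(recurrent_mix), sorted(recurrent_absorb))
```

In this toy example the randomizing policy keeps all three states recurrent, while the second policy leaves only state 2 recurrent. The design problem in the abstract can be read as searching over policies for the largest such set while the invariant pmf satisfies the given convex constraints.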
Related Topics
Physical Sciences and Engineering
Engineering
Control and Systems Engineering
Authors
Eduardo Arvelo, Nuno C. Martins