کد مقاله کد نشریه سال انتشار مقاله انگلیسی نسخه تمام متن
715507 892204 2014 6 صفحه PDF دانلود رایگان
عنوان انگلیسی مقاله ISI
LP-Based Approaches to Stationary-Constrained Markov Decision Problems
موضوعات مرتبط
مهندسی و علوم پایه سایر رشته های مهندسی مکانیک محاسباتی
پیش نمایش صفحه اول مقاله
LP-Based Approaches to Stationary-Constrained Markov Decision Problems
چکیده انگلیسی

We study a class of Markov Decision Processes (MDPs) under stationary constraints (particularly but not exclusively, chance constraints). Our model focuses on problems where the state space is finite but the control at each state may take real values. It is straightforward to formulate the problem as a constrained nonlinear optimization problem, but solving it requires either projection, penalty or gradient-based methods that may show slow convergence, particularly because this type of problems may not be strictly convex.In this paper we extend results that are known for the discrete control model and show that the solution of a binary randomized control problem is optimal under the assumption that the dependency on the control variable is linear. If the dependency is piecewise linear with several breakpoints, then the randomized problem is not equivalent to the original problem; however we show that a particular solution always exists that is also the solution to the original continuous problem. This special solution takes the form of a two-action control policy for each state. Solving each two-action randomized problem can be done in polynomial time by linear programming (LP). This leads to an efficient solution when the number of states is small, but for a large state space the complexity can be prohibitive. In a restricted yet interesting special case of piecewise linear, multiple actions setting where the actions are linearly ordered, we show that the problem can again be reduced to a single LP.To overcome the computational complexity in the general case, we propose an alternative approach that involves solving a sequence of linear programs modeling the stationary measure and the optimal solution concurrently. We conclude the paper by discussing future extensions of this framework to application settings where the state space is continuous as well.

ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: IFAC Proceedings Volumes - Volume 47, Issue 2, 2014, Pages 320-325