The control of a two-level Markov decision process by time aggregation

Article code	Journal code	Publication year	English article	Full-text version
697870	890385	2006	11-page PDF	Free download

Keywords

Time aggregation; Policy iteration; Two-level systems; Markov decision processes

Related topics

Engineering and Basic Sciences; Other Engineering Disciplines; Control and Systems Engineering

Abstract

The solution of Markov Decision Processes (MDPs) often relies on special properties of the processes. For two-level MDPs, the difference in the rates of state changes of the upper and lower levels has led to limiting or approximate solutions of such problems. In this paper, we solve a two-level MDP without making any assumption on the rates of state changes of the two levels. We first show that such a two-level MDP is a non-standard one where the optimal actions of different states can be related to each other. Then we give assumptions (conditions) under which such a specially constrained MDP can be solved by policy iteration. We further show that the computational effort can be reduced by decomposing the MDP. A two-level MDP with M upper-level states can be decomposed into one MDP for the upper level and M to M(M-1) MDPs for the lower level, depending on the structure of the two-level MDP. The upper-level MDP is solved by time aggregation, a technique introduced in a recent paper [Cao, X.-R., Ren, Z. Y., Bhatnagar, S., Fu, M., & Marcus, S. (2002). A time aggregation approach to Markov decision processes. Automatica, 38(6), 929-943.], and the lower-level MDPs are solved by embedded Markov chains.
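For readers unfamiliar with policy iteration, the following is a minimal sketch of the generic algorithm on a small finite MDP. It is not the constrained two-level procedure or the time-aggregation method of the paper. It assumes a discounted-reward criterion for simplicity, and the names `policy_iteration`, `P`, `R`, and `gamma`, as well as the two-state example, are hypothetical illustrations that do not come from the source.

```python
def policy_iteration(P, R, gamma=0.9, tol=1e-10):
    """Generic discounted policy iteration (illustrative sketch only).

    P[a][s][t] is the probability of moving from state s to state t
    under action a; R[a][s] is the expected one-step reward.
    """
    n_actions, n_states = len(P), len(P[0])
    policy = [0] * n_states      # start from an arbitrary policy
    V = [0.0] * n_states

    def q_value(a, s):
        # One-step lookahead value of taking action a in state s.
        return R[a][s] + gamma * sum(P[a][s][t] * V[t] for t in range(n_states))

    while True:
        # Policy evaluation: iterate the fixed-policy Bellman equation
        # until the largest update falls below tol.
        while True:
            delta = 0.0
            for s in range(n_states):
                v = q_value(policy[s], s)
                delta = max(delta, abs(v - V[s]))
                V[s] = v
            if delta < tol:
                break

        # Policy improvement: switch to a strictly better action if one exists.
        stable = True
        for s in range(n_states):
            best = max(range(n_actions), key=lambda a: q_value(a, s))
            if q_value(best, s) > q_value(policy[s], s) + 1e-9:
                policy[s] = best
                stable = False
        if stable:
            return policy, V


# Hypothetical example: action 0 stays in place, action 1 switches state.
# Only staying in state 1 earns a reward.
P = [
    [[1.0, 0.0], [0.0, 1.0]],   # action 0: stay
    [[0.0, 1.0], [1.0, 0.0]],   # action 1: switch
]
R = [
    [0.0, 1.0],                 # action 0 rewards in states 0 and 1
    [0.0, 0.0],                 # action 1 rewards in states 0 and 1
]
policy, V = policy_iteration(P, R)
```

In this toy example the improved policy switches out of state 0 and stays in state 1. The abstract's contribution lies in handling MDPs where actions in different states are linked and in decomposing the problem across levels, neither of which this sketch attempts.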

Publisher

Database: Elsevier - ScienceDirect
Journal: Automatica - Volume 42, Issue 3, March 2006, Pages 393-403

Authors

Yat-wah Wan, Xi-Ren Cao
