On functional equations for Kth best policies in Markov decision processes

Article code	Journal code	Year of publication	English article	Full text
696801	890347	2013	4-page PDF	Free download


Keywords

Dynamic programming; Ranks; Markov decision processes

Related subjects

Engineering and Basic Sciences; Other Engineering Disciplines; Control and Systems Engineering


Abstract

This paper revisits the problem of finding the values of Kth best policies for finite-horizon finite Markov decision processes. The recursive dynamic-programming (DP) equations established by Bellman and Kalaba for non-deterministic MDPs with zero-cost function in [Bellman, R., & Kalaba, R. (1960). On kth best policies. Journal of SIAM, 8, 582–588] are incomplete because expectation and selection for the Kth minimum do not interchange in general. Based on the DP equations by Dreyfus for the Kth shortest path problem, some non-DP equations generally satisfied by the values of the Kth best policies are identified, from which corrected Bellman and Kalaba’s DP equations are derived with an appropriate sufficient condition.
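The remark that expectation and Kth-minimum selection do not interchange can be illustrated with a small numerical sketch. The toy MDP below is hypothetical and is not taken from the paper: a root state has a single action that leads to successor `y1` or `y2` with probability 0.5 each, and each successor offers two terminal costs. The function `same_rank_values` is an assumed, simplified reading of a recursion that averages same-rank successor values; it is not the exact Bellman and Kalaba equations. All names (`PROBS`, `TERMINAL_COSTS`, `kth_smallest`, and so on) are made up for illustration.

```python
from itertools import product

# Hypothetical toy MDP (not from the paper): the root's single action leads
# to y1 or y2 with probability 0.5 each; each successor has two terminal costs.
PROBS = {"y1": 0.5, "y2": 0.5}
TERMINAL_COSTS = {"y1": [0.0, 10.0], "y2": [0.0, 4.0]}


def kth_smallest(values, k):
    """Return the k-th smallest entry (1-indexed), counting duplicates."""
    return sorted(values)[k - 1]


def true_policy_values():
    """Expected cost of every policy: one terminal choice per successor state."""
    states = list(PROBS)
    values = []
    for choice in product(*(TERMINAL_COSTS[s] for s in states)):
        values.append(sum(PROBS[s] * c for s, c in zip(states, choice)))
    return values


def same_rank_values():
    """Assumed shortcut: expectation of the j-th best successor value, per rank j."""
    n_ranks = min(len(costs) for costs in TERMINAL_COSTS.values())
    return [
        sum(PROBS[s] * kth_smallest(TERMINAL_COSTS[s], j) for s in PROBS)
        for j in range(1, n_ranks + 1)
    ]


if __name__ == "__main__":
    k = 2
    # K-th smallest over the expected costs of all policies.
    print("over all policies:", kth_smallest(true_policy_values(), k))
    # K-th smallest when only same-rank continuations are averaged.
    print("same-rank shortcut:", kth_smallest(same_rank_values(), k))
```

In this toy case the four policies have expected costs 0, 2, 5 and 7, so the second-best value is 2. The same-rank shortcut only sees 0 and 7, because it never pairs the best continuation in one successor with the second-best continuation in the other.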

Publisher

Database: Elsevier - ScienceDirect
Journal: Automatica - Volume 49, Issue 1, January 2013, Pages 297–300

Authors

Hyeong Soo Chang
