Finding the K best policies in a finite-horizon Markov decision process

Article code	Journal code	Publication year	English article	Full-text version
482252	1446211	2006	16-page PDF	Free download

Keywords

Hyperpaths; Stochastic dynamic programming; Directed hypergraphs

Related topics

Engineering and basic sciences; Computer engineering; Computer science (general)

Abstract

Directed hypergraphs are a general modelling and algorithmic tool that has been used successfully in many different research areas, such as artificial intelligence, database systems, fuzzy systems, propositional logic and transportation networks. However, modelling Markov decision processes using directed hypergraphs has not yet been considered. In this paper we consider finite-horizon Markov decision processes (MDPs) with finite state and action spaces and present an algorithm for finding the K best deterministic Markov policies. That is, we are interested in ranking the first K deterministic Markov policies in non-decreasing order using an additive criterion of optimality. The algorithm uses a directed hypergraph to model the finite-horizon MDP. It is shown that the problem of finding the optimal policy can be formulated as a minimum weight hyperpath problem and solved in time linear in the size of the input data representing the MDP, under different additive optimality criteria.
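
To make the problem statement concrete, the following is a minimal Python sketch of ranking deterministic Markov policies in a finite-horizon MDP under an additive expected-cost criterion. It is not the hypergraph algorithm of the paper: the optimal policy is found by ordinary backward induction, and the K best policies are found by brute-force enumeration, which is only feasible for tiny instances. The toy model, the initial distribution and all names (`MODEL`, `TERMINAL_COST`, `INITIAL_DIST`, `backward_induction`, `k_best_by_enumeration`) are assumptions made for illustration.

```python
from itertools import product

# Hypothetical toy MDP (not from the paper): two states, two actions,
# two decision epochs, costs to be minimised.
STATES = (0, 1)
ACTIONS = ("a", "b")
HORIZON = 2
# MODEL[(state, action)] = (immediate cost, {next state: probability})
MODEL = {
    (0, "a"): (1.0, {0: 0.5, 1: 0.5}),
    (0, "b"): (2.0, {0: 1.0}),
    (1, "a"): (0.0, {1: 1.0}),
    (1, "b"): (3.0, {0: 0.5, 1: 0.5}),
}
TERMINAL_COST = {0: 0.0, 1: 3.0}   # cost paid after the last epoch
INITIAL_DIST = {0: 0.5, 1: 0.5}    # assumed distribution of the start state


def q_value(state, action, next_value):
    """Immediate cost plus expected cost-to-go of taking `action` in `state`."""
    cost, transitions = MODEL[(state, action)]
    return cost + sum(p * next_value[s2] for s2, p in transitions.items())


def expected_cost(value):
    """Expected total cost given the cost-to-go at the first epoch."""
    return sum(INITIAL_DIST[s] * value[s] for s in STATES)


def backward_induction():
    """Return (optimal expected cost, decision rules indexed by epoch)."""
    value = dict(TERMINAL_COST)
    rules = []
    for _ in range(HORIZON):  # sweep from the last epoch back to the first
        rule = {s: min(ACTIONS, key=lambda a: q_value(s, a, value))
                for s in STATES}
        value = {s: q_value(s, rule[s], value) for s in STATES}
        rules.append(rule)
    rules.reverse()  # rules[t] is now the decision rule at epoch t
    return expected_cost(value), rules


def policy_cost(rules):
    """Expected total cost of a fixed deterministic Markov policy."""
    value = dict(TERMINAL_COST)
    for rule in reversed(rules):
        value = {s: q_value(s, rule[s], value) for s in STATES}
    return expected_cost(value)


def k_best_by_enumeration(k):
    """Rank all deterministic Markov policies by cost and keep the first k."""
    keys = [(t, s) for t in range(HORIZON) for s in STATES]
    ranked = []
    for choice in product(ACTIONS, repeat=len(keys)):
        assignment = dict(zip(keys, choice))
        rules = [{s: assignment[(t, s)] for s in STATES}
                 for t in range(HORIZON)]
        ranked.append((policy_cost(rules), rules))
    ranked.sort(key=lambda pair: pair[0])  # non-decreasing expected cost
    return ranked[:k]


if __name__ == "__main__":
    print(backward_induction())
    for cost, rules in k_best_by_enumeration(3):
        print(cost, rules)
```

The enumeration visits every one of the |A|^(|S|·N) deterministic Markov policies (16 in this toy model), which is exactly the blow-up that a dedicated K best algorithm is meant to avoid.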

Publisher

Database: Elsevier - ScienceDirect
Journal: European Journal of Operational Research - Volume 175, Issue 2, 1 December 2006, Pages 1164–1179

Authors

Lars Relund Nielsen, Anders Ringgaard Kristensen
