Approximate stochastic annealing for online control of infinite horizon Markov decision processes

Article code	Journal code	Publication year	English article	Full-text version
696923	890352	2012	7-page PDF	Free download

Keywords

Algorithms; Stochastic approximation; Markov decision process; Simulation

Related topics

Engineering and Basic Sciences, Other Engineering Disciplines, Control and Systems Engineering

Abstract

We present an online simulation-based algorithm called Approximate Stochastic Annealing (ASA) for solving infinite-horizon finite state-action space Markov decision processes (MDPs). The algorithm estimates the optimal policy by sampling at each iteration from a probability distribution function over the policy space, which is updated iteratively based on the Q-function estimates obtained via a recursion of Q-learning type. By exploiting a novel connection of ASA to the stochastic approximation method, we show that the sequence of distribution functions generated by the algorithm converges to a degenerate distribution that concentrates only on the optimal policy. Numerical examples are also provided to illustrate the algorithm.
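
The loop described in the abstract (sample from a distribution over policies, refine Q-function estimates with a Q-learning-type recursion, then update the distribution) can be illustrated with a rough sketch. The abstract does not give the actual update rules, so everything specific below is an assumption made for illustration and is not the ASA algorithm from the paper: the toy two-state MDP, the Boltzmann target with a decreasing temperature, the constant step sizes `lr` and `mix`, and the uniform exploration weight `explore` are all hypothetical choices.

```python
import math
import random

# Hypothetical 2-state, 2-action MDP (not taken from the paper).
# Action 0 stays in the current state, action 1 moves to the other state.
# Only (state 1, action 0) pays a reward of 1.
N_STATES, N_ACTIONS, GAMMA = 2, 2, 0.9


def step(state, action):
    """Simulate one deterministic transition; returns (next_state, reward)."""
    next_state = state if action == 0 else 1 - state
    reward = 1.0 if (state == 1 and action == 0) else 0.0
    return next_state, reward


def boltzmann(q_row, temperature):
    """Softmax of Q-values at a given temperature (max-shifted for stability)."""
    top = max(q_row)
    weights = [math.exp((q - top) / temperature) for q in q_row]
    total = sum(weights)
    return [w / total for w in weights]


def run(n_iters=20000, seed=0, lr=0.1, mix=0.1, explore=0.1):
    rng = random.Random(seed)
    q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
    # phi[s][a]: probability that the sampled policy picks action a in state s.
    phi = [[1.0 / N_ACTIONS] * N_ACTIONS for _ in range(N_STATES)]
    state = 0
    for k in range(n_iters):
        # Assumption: sample from phi mixed with a uniform distribution,
        # so that every state-action pair keeps being visited.
        probs = [(1 - explore) * p + explore / N_ACTIONS for p in phi[state]]
        action = rng.choices(range(N_ACTIONS), weights=probs)[0]
        next_state, reward = step(state, action)
        # Q-learning-type recursion on the observed transition.
        target = reward + GAMMA * max(q[next_state])
        q[state][action] += lr * (target - q[state][action])
        # Assumption: pull phi toward a Boltzmann distribution over the
        # current Q-values, with a temperature that shrinks over time.
        temperature = 1.0 / (1.0 + 0.01 * k)
        goal = boltzmann(q[state], temperature)
        phi[state] = [(1 - mix) * p + mix * g for p, g in zip(phi[state], goal)]
        state = next_state
    return q, phi


q, phi = run()
greedy = [max(range(N_ACTIONS), key=lambda a: q[s][a]) for s in range(N_STATES)]
print("greedy policy:", greedy)
print("policy distribution:", [[round(p, 3) for p in row] for row in phi])
```

In this toy problem the optimal policy is to move out of state 0 and stay in state 1, so the distribution `phi` should place almost all of its mass on those two actions as the temperature drops. This mirrors, in a very simplified form, the idea of a distribution over policies collapsing onto the optimal one.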

Publisher

Database: Elsevier - ScienceDirect
Journal: Automatica - Volume 48, Issue 9, September 2012, Pages 2182-2188

Authors

Jiaqiao Hu, Hyeong Soo Chang
