Free article download: Optimization of Markov decision processes under the variance criterion

Article code	Journal code	Publication year	English article	Full-text version
4999930	1460642	2016	10-page PDF	Free download

English title of the ISI article

Optimization of Markov decision processes under the variance criterion

Persian translation of the title

بهینه سازی فرآیند تصمیم گیری مارکوف تحت معیار واریانس

Keywords

Markov decision process; variance criterion; sensitivity-based optimization; policy iteration; policy gradient

Related topics

Engineering and Basic Sciences > Other Engineering Fields > Control and Systems Engineering

Article preview

Optimization of Markov decision processes under the variance criterion

English abstract

In this paper, we study a variance minimization problem in an infinite stage discrete time Markov decision process (MDP), regardless of the mean performance. For the Markov chain under the variance criterion, since the value of the cost function at the current stage will be affected by future actions, this problem is not a standard MDP and the traditional MDP theory is not applicable. In this paper, we convert the variance minimization problem into a standard MDP by introducing a concept called pseudo variance. Then we derive a variance difference formula that quantifies the difference of variances of Markov systems under any two policies. With the difference formula, the correlation of the variance cost function at different stages can be decoupled through a nonnegative term. A necessary condition of the optimal policy is obtained. It is also proved that the optimal policy with the minimal variance can be found in the deterministic policy space. Furthermore, we propose an efficient iterative algorithm to reduce the variance of Markov systems. We prove that this algorithm can converge to a local optimum. Finally, a numerical experiment is conducted to demonstrate the efficiency of our algorithm compared with the gradient-based method widely adopted in the literature.
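The abstract does not define its quantities, so the following is only a minimal sketch under stated assumptions, not the paper's algorithm. It assumes the variance is the steady-state variance of a per-state cost `f` under a fixed policy with transition matrix `P`, and that the "pseudo variance" measures the spread around a fixed reference value `lam` instead of the true mean. All function names, `P`, `f`, and `lam` are hypothetical. Under these assumptions the pseudo variance equals the variance plus `(eta - lam)**2`, which illustrates why fixing `lam` gives a cost `(f - lam)**2` that no longer depends on future actions.

```python
# Illustrative sketch only; definitions below are assumptions, not taken from the paper.

def stationary_distribution(P, iters=10_000, tol=1e-13):
    """Power iteration for the stationary distribution of an ergodic chain."""
    n = len(P)
    pi = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(iters):
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi

def steady_state_mean(pi, f):
    """Long-run average cost: eta = sum_i pi(i) * f(i)."""
    return sum(p * c for p, c in zip(pi, f))

def steady_state_variance(pi, f):
    """Assumed variance criterion: sum_i pi(i) * (f(i) - eta)^2."""
    eta = steady_state_mean(pi, f)
    return sum(p * (c - eta) ** 2 for p, c in zip(pi, f))

def pseudo_variance(pi, f, lam):
    """Assumed pseudo variance: spread around a fixed reference value lam."""
    return sum(p * (c - lam) ** 2 for p, c in zip(pi, f))

# Hypothetical two-state chain under one fixed policy.
P = [[0.9, 0.1],
     [0.5, 0.5]]
f = [1.0, 4.0]   # hypothetical per-state costs
lam = 2.0        # hypothetical reference value

pi = stationary_distribution(P)
eta = steady_state_mean(pi, f)
var = steady_state_variance(pi, f)
pvar = pseudo_variance(pi, f, lam)

# Identity under the assumed definitions: pvar == var + (eta - lam)^2,
# so the pseudo variance upper-bounds the variance, with equality at lam == eta.
gap = pvar - (var + (eta - lam) ** 2)
```

In this sketch, `gap` should be zero up to floating-point error. A policy-iteration scheme in the spirit of the abstract could alternate between setting `lam` to the current mean and improving the policy against the cost `(f - lam)**2`, but that loop is not shown here because the abstract does not specify it.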

Publisher

Database: Elsevier - ScienceDirect
Journal: Automatica - Volume 73, November 2016, Pages 269-278

Authors

Li Xia
