Free article download: Mean-variance optimization of discrete time discounted Markov decision processes

Article code	Journal code	Publication year	English article	Full-text version
7109102	1460627	2018	7-page PDF	Free download

English title of the ISI article

Mean-variance optimization of discrete time discounted Markov decision processes

Keywords

Markov decision process

Related topics

Engineering and Basic Sciences > Other Engineering Fields > Control and Systems Engineering

Abstract

In this paper, we study a mean-variance optimization problem in an infinite-horizon, discrete-time discounted Markov decision process (MDP). The objective is to minimize the variance of the system rewards subject to a constraint on the mean performance. Unlike most works in the literature, which require the mean performance to have already reached its optimum, we allow the discounted performance to equal any given constant. The difficulty of this problem stems from the quadratic form of the variance function, which means the variance minimization problem is not a standard MDP. By proving that the feasible policy space has a decomposable structure, we transform this constrained variance minimization problem into an equivalent unconstrained MDP under a new discounted criterion and a new reward function. A difference formula quantifies the difference between the variances of the Markov chains under any two feasible policies. Based on this variance difference formula, a policy iteration algorithm is developed to find the optimal policy. We also prove that deterministic policies are optimal over the randomized policies generated in the mean-constrained policy space. Numerical experiments demonstrate the effectiveness of our approach.
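
The abstract's point about the quadratic form can be illustrated with a small sketch. It is not the paper's algorithm, and the variance-like criterion used here is an assumption chosen for illustration, not the paper's definition. The two-state chain `P`, the rewards `r`, the discount factor `gamma`, the target `lam`, and the helper `discounted_value` are all hypothetical. The idea shown is only this: a squared deviation from the mean normally depends on the policy through the mean itself, but once the mean is pinned to a constant, the squared deviation becomes an ordinary per-state reward that standard discounted evaluation can handle. The sketch does not check that the chain actually meets the target `lam`.

```python
# Illustrative sketch only: the chain, rewards, and the variance-like
# criterion below are assumptions, not taken from the paper.

def discounted_value(P, reward, gamma, tol=1e-12, max_iter=100_000):
    """Solve v = reward + gamma * P v by fixed-point iteration."""
    n = len(reward)
    v = [0.0] * n
    for _ in range(max_iter):
        new_v = [
            reward[i] + gamma * sum(P[i][j] * v[j] for j in range(n))
            for i in range(n)
        ]
        # Stop once successive iterates agree to within tol.
        if max(abs(a - b) for a, b in zip(new_v, v)) < tol:
            return new_v
        v = new_v
    return v


# Hypothetical two-state Markov chain induced by some fixed policy.
P = [[0.9, 0.1],
     [0.5, 0.5]]
r = [1.0, 3.0]
gamma = 0.9

# Mean discounted reward from each initial state.
J = discounted_value(P, r, gamma)

# With the mean pinned to a constant lam (hypothetical target), the
# squared deviation (r - lam)^2 no longer depends on the policy, so it
# can be evaluated like an ordinary discounted reward.
lam = 1.5
sq_dev = [(ri - lam) ** 2 for ri in r]
V = discounted_value(P, sq_dev, gamma)

print("mean discounted reward per state:", J)
print("discounted squared deviation per state:", V)
```

Fixed-point iteration is used here only to keep the sketch free of third-party dependencies; a direct linear solve of `(I - gamma * P) v = reward` would serve equally well.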

Publisher

Database: Elsevier - ScienceDirect
Journal: Automatica - Volume 88, February 2018, Pages 76-82

Authors

Li Xia
