Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
485161 | 703313 | 2014 | 8-page PDF | Free download |
Actor-critic algorithms are among the most well-studied reinforcement learning algorithms for solving Markov decision processes (MDPs) via simulation. Unfortunately, the parameters of the so-called "actor" in the classical actor-critic algorithm exhibit great volatility, often becoming unbounded in practice, so they must be artificially constrained to obtain usable solutions. The algorithm is often used in conjunction with Boltzmann action selection, where a temperature parameter may be needed to make the algorithm work, yet convergence has only been proved for a temperature of 1. We propose a new actor-critic algorithm whose actor parameters remain bounded. We present a mathematical proof of this boundedness and test the algorithm on small-scale MDPs under the infinite-horizon discounted-reward criterion. Our algorithm produces encouraging numerical results.
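For orientation, the following is a minimal sketch of the *classical* actor-critic scheme with Boltzmann (softmax) action selection that the abstract refers to, not the paper's proposed bounded variant. The toy two-state MDP, step sizes, temperature `tau`, and random seed are illustrative assumptions, and the comments note where the actor parameters can drift without bound, which is the issue the paper addresses.

```python
# Minimal sketch (assumed toy setup, not the paper's algorithm): classical
# actor-critic with Boltzmann action selection on a two-state, two-action MDP.
import numpy as np

rng = np.random.default_rng(0)

# Toy MDP (assumed): P[s, a, s'] transition probabilities, R[s, a] rewards.
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.1, 0.9]]])
R = np.array([[1.0, 0.0],
              [0.0, 2.0]])
gamma = 0.9                # discount factor (infinite-horizon discounted reward)
tau = 1.0                  # Boltzmann temperature; convergence proofs assume tau = 1
alpha, beta = 0.01, 0.05   # actor / critic step sizes (assumed values)

theta = np.zeros((2, 2))   # actor parameters: one preference per (state, action)
V = np.zeros(2)            # critic: state-value estimates

def boltzmann(prefs, tau):
    """Softmax action probabilities at temperature tau."""
    z = (prefs - prefs.max()) / tau
    e = np.exp(z)
    return e / e.sum()

s = 0
for t in range(50_000):
    pi = boltzmann(theta[s], tau)
    a = rng.choice(2, p=pi)
    s_next = rng.choice(2, p=P[s, a])
    r = R[s, a]

    # TD error drives both the critic and the actor updates.
    delta = r + gamma * V[s_next] - V[s]
    V[s] += beta * delta

    # Classical actor update: gradient of log pi(a|s) w.r.t. the preferences.
    # With no explicit bound, theta can keep growing as one action dominates,
    # which is why implementations often clip it artificially.
    grad_log = -pi
    grad_log[a] += 1.0
    theta[s] += alpha * delta * grad_log

    s = s_next

print("Actor preferences:\n", theta)
print("Critic values:", V)
```

Running the sketch shows the actor preferences continuing to spread apart as the preferred action's probability approaches 1, illustrating the unbounded-parameter behaviour that motivates the bounded actor-critic algorithm proposed in the paper.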
Journal: Procedia Computer Science - Volume 36, 2014, Pages 500-507