On the bias of batch Bellman residual minimisation

Article code	Journal code	Publication year	English article
410664	679154	2008	4-page PDF

Keywords

Learning theory; Reinforcement learning; Machine learning

Related topics

Engineering and Basic Sciences; Computer Engineering; Artificial Intelligence

Abstract

This letter addresses Bellman residual minimisation in reinforcement learning for the model-free batch case. We prove the simple but not necessarily obvious result that no unbiased estimate of the Bellman residual exists for a single trajectory of observations. We then take up the recent suggestion of Antos et al. [Learning near-optimal policies with Bellman-residual minimisation based fitted policy iteration and a single sample path, in: COLT, 2006, pp. 574–588] for approximate Bellman residual minimisation and discuss its properties with respect to consistency, bias, and optimality. Finally, we suggest how to improve the optimality.
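
The bias at issue can be illustrated with a small numerical sketch. It is not the letter's proof; it only shows the standard reason the naive estimator fails: the expected squared one-sample residual equals the squared Bellman residual plus the variance of the sampled backup. The three-state transition model, the value table, and all function names below are hypothetical choices made for illustration.

```python
# Hypothetical one-step problem: from state "s" the process moves to
# "a" or "b" with equal probability and zero reward (assumed values).
GAMMA = 0.9
V = {"s": 0.0, "a": 1.0, "b": -1.0}
# Each entry is (probability, reward, next_state) for transitions out of "s".
TRANSITIONS = [(0.5, 0.0, "a"), (0.5, 0.0, "b")]


def expected_backup(v, transitions, gamma):
    """E[r + gamma * V(s')], i.e. the Bellman backup (TV)(s)."""
    return sum(p * (r + gamma * v[nxt]) for p, r, nxt in transitions)


def true_squared_residual(v, transitions, gamma, state):
    """Squared Bellman residual ((TV)(s) - V(s))^2, using the known model."""
    return (expected_backup(v, transitions, gamma) - v[state]) ** 2


def expected_sampled_squared_residual(v, transitions, gamma, state):
    """E[(r + gamma * V(s') - V(s))^2]: what a one-sample estimator targets."""
    return sum(
        p * (r + gamma * v[nxt] - v[state]) ** 2 for p, r, nxt in transitions
    )


def backup_variance(v, transitions, gamma):
    """Var[r + gamma * V(s')], the term that biases the sampled estimator."""
    mean = expected_backup(v, transitions, gamma)
    return sum(p * (r + gamma * v[nxt] - mean) ** 2 for p, r, nxt in transitions)


true_sq = true_squared_residual(V, TRANSITIONS, GAMMA, "s")
sampled_sq = expected_sampled_squared_residual(V, TRANSITIONS, GAMMA, "s")
var = backup_variance(V, TRANSITIONS, GAMMA)

print("squared Bellman residual:        ", true_sq)
print("expected sampled squared residual:", sampled_sq)
print("variance of the sampled backup:   ", var)
```

In this toy case the value table satisfies the Bellman equation at "s", so the true squared residual is zero, yet the sampled quantity is strictly positive because each observed transition sees only one of the two successors. With a single trajectory there is no second independent successor sample from the same state with which to cancel the variance term.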

Publisher

Database: Elsevier - ScienceDirect
Journal: Neurocomputing - Volume 72, Issues 7–9, March 2009, Pages 2005–2008

Authors

Daniel Schneegass
