Approximation spaces in off-policy Monte Carlo learning

Article code	Journal code	Publication year	English article	Full-text version
381576	1437504	2007	9-page PDF	Free download

Keywords

Expectation; Monte Carlo method; Approximation space; Rough sets

Related topics

Engineering and basic sciences, Computer engineering, Artificial intelligence

Abstract

This paper introduces an approach to off-policy Monte Carlo (MC) learning guided by behaviour patterns gleaned from approximation spaces and rough set theory, introduced by Zdzisław Pawlak in 1981. During reinforcement learning, an agent selects actions in an effort to maximize a reward signal obtained from the environment. The problem considered in this paper is how to estimate the expected value of cumulative future discounted rewards when evaluating agent actions during reinforcement learning. The solution to this problem results from a form of weighted sampling that combines MC methods and approximation spaces to estimate the expected value of returns on actions. This is made possible by considering behaviour patterns of an agent in the context of approximation spaces. The framework provided by an approximation space makes it possible to measure the degree to which agent behaviours are a part of (“covered by”) a set of accepted agent behaviours that serve as a behaviour evaluation norm. Furthermore, this article introduces an adaptive action control strategy called run-and-twiddle (RT) (a form of adaptive learning introduced by Oliver Selfridge in 1984), where approximation spaces are constructed on a “need by need” basis. Finally, a monocular vision system has been selected to facilitate the evaluation of the reinforcement learning methods. The goal of the vision system is to track a moving object, and rewards are based on the proximity of the object to the centre of the camera field of view. The contribution of this article is the introduction of an RT form of off-policy MC learning.
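
As a rough illustration of the two ingredients the abstract names, the sketch below shows a textbook weighted importance-sampling estimate of the expected discounted return under a target policy, computed from episodes generated by a different behaviour policy, together with a simple set-overlap coverage ratio. It is not the authors' algorithm: the function names, the episode layout, the policy-probability callables, and the coverage formula |block ∩ standard| / |standard| (taken as 1 when the standard is empty) are all assumptions made for illustration, and the abstract does not say how coverage enters the sampling weights.

```python
from typing import Callable, Hashable, List, Sequence, Set, Tuple

# Hypothetical episode layout: a list of (state, action, reward) steps.
Step = Tuple[Hashable, Hashable, float]
PolicyProb = Callable[[Hashable, Hashable], float]


def discounted_return(rewards: Sequence[float], gamma: float) -> float:
    """Cumulative discounted reward r_0 + gamma*r_1 + gamma^2*r_2 + ..."""
    g = 0.0
    # Fold from the last reward backwards so each step applies one discount.
    for r in reversed(rewards):
        g = r + gamma * g
    return g


def weighted_is_estimate(
    episodes: List[List[Step]],
    target_prob: PolicyProb,
    behaviour_prob: PolicyProb,
    gamma: float,
) -> float:
    """Weighted importance-sampling estimate of the expected return.

    Each episode's return is weighted by the product of probability ratios
    target/behaviour over its steps; the estimate is the weighted mean.
    Assumes behaviour_prob is non-zero for every action it generated.
    """
    numerator = 0.0
    denominator = 0.0
    for episode in episodes:
        weight = 1.0
        for state, action, _ in episode:
            weight *= target_prob(state, action) / behaviour_prob(state, action)
        g = discounted_return([reward for _, _, reward in episode], gamma)
        numerator += weight * g
        denominator += weight
    # With no usable episodes (all weights zero) fall back to 0.0.
    return numerator / denominator if denominator > 0 else 0.0


def coverage(block: Set[Hashable], standard: Set[Hashable]) -> float:
    """Assumed coverage ratio: share of the standard set overlapped by block."""
    if not standard:
        return 1.0  # assumed convention for an empty standard
    return len(block & standard) / len(standard)
```

Weighted (rather than ordinary) importance sampling is used here because normalising by the sum of weights keeps the estimate within the range of observed returns, at the cost of some bias for small samples.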

Publisher

Database: Elsevier - ScienceDirect
Journal: Engineering Applications of Artificial Intelligence - Volume 20, Issue 5, August 2007, Pages 667–675

Authors

James F. Peters, Christopher Henry
