Article code: 4952463
Journal code: 1442035
Publication year: 2016
English article: 19 pages, PDF
Full-text version: free download
English title of the ISI article
Extreme state aggregation beyond Markov decision processes
Related subjects
Engineering and Basic Sciences; Computer Engineering; Computational Theory and Mathematics
English abstract
We consider a reinforcement learning setup where an agent interacts with an environment in observation-reward-action cycles without any (in particular, MDP) assumptions on the environment. State aggregation, and more generally feature reinforcement learning, is concerned with mapping histories/raw-states to reduced/aggregated states. The idea behind both is that the resulting reduced process (approximately) forms a small stationary finite-state MDP, which can then be efficiently solved or learnt. We considerably generalize existing aggregation results by showing that even if the reduced process is not an MDP, the (q-)value functions and (optimal) policies of an associated MDP with the same state-space size solve the original problem, as long as the solution can approximately be represented as a function of the reduced states. This implies an upper bound on the required state-space size that holds uniformly for all RL problems. It may also explain why RL algorithms designed for MDPs sometimes perform well beyond MDPs.
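To make the setting concrete, the following is a minimal sketch (not from the paper) of state aggregation in reinforcement learning: a toy non-Markovian environment whose reward depends on the last two observations, an aggregation map phi that keeps only those two observations, and tabular Q-learning run on the aggregated states. ParityEnv, phi, and all hyperparameters are illustrative assumptions; the paper's contribution is the theoretical guarantee that such a reduction can preserve near-optimal policies even when the reduced process is not an MDP.

```python
# Minimal sketch (illustrative, not from the paper): tabular Q-learning
# run on aggregated states phi(history). ParityEnv, phi, and all
# hyperparameters below are invented for this example.
import random
from collections import defaultdict

class ParityEnv:
    """Toy non-Markovian environment: the rewarded action is the parity
    of the last TWO coin observations, so a single raw observation is
    not a sufficient (Markov) state."""
    def reset(self):
        self.prev = random.randint(0, 1)   # hidden earlier observation
        self.cur = random.randint(0, 1)
        self.t = 0
        return self.cur
    def step(self, action):
        reward = 1.0 if action == (self.prev + self.cur) % 2 else 0.0
        self.prev, self.cur = self.cur, random.randint(0, 1)
        self.t += 1
        return self.cur, reward, self.t >= 10

def phi(history):
    """Aggregation map: keep only the last two observations. For this
    toy environment the reduced process is a small stationary MDP."""
    return tuple(history[-2:])

def q_learning_on_aggregated_states(env, n_actions, episodes=2000,
                                    alpha=0.1, gamma=0.95, epsilon=0.1):
    """Learn (q-)values on the reduced state space phi(history)."""
    q = defaultdict(float)                 # keys: (aggregated state, action)
    for _ in range(episodes):
        history = [env.reset()]            # raw observation history
        done = False
        while not done:
            s = phi(history)
            if random.random() < epsilon:  # epsilon-greedy exploration
                a = random.randrange(n_actions)
            else:
                a = max(range(n_actions), key=lambda act: q[(s, act)])
            obs, reward, done = env.step(a)
            history.append(obs)
            s2 = phi(history)
            target = reward if done else reward + gamma * max(
                q[(s2, act)] for act in range(n_actions))
            q[(s, a)] += alpha * (target - q[(s, a)])
    return q

q = q_learning_on_aggregated_states(ParityEnv(), n_actions=2)
for s in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(s, "-> greedy action", max(range(2), key=lambda act: q[(s, act)]))
```

After training, the greedy policy on the four two-observation aggregated states should recover the parity rule, i.e. solving the small reduced MDP yields a near-optimal policy for the original non-Markovian environment, in the spirit of the paper's result.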
Publisher
Database: Elsevier - ScienceDirect
Journal: Theoretical Computer Science - Volume 650, 18 October 2016, Pages 73-91
Authors
Marcus Hutter