Free article download: Corticostriatal circuit mechanisms of value-based action selection: Implementation of reinforcement learning algorithms and beyond

Article code	Journal code	Publication year	English article	Full-text version
6255896	1612922	2016	12-page PDF	Free download

English title of the ISI article

Review: Corticostriatal circuit mechanisms of value-based action selection: Implementation of reinforcement learning algorithms and beyond

Keywords

Corticostriatal; Action selection; Winner-take-all; Reward prediction error; Nonlinear dynamics; Reinforcement learning

Related topics

Life Sciences and Biotechnology; Neuroscience; Behavioral Neuroscience

Abstract

- Striatal circuit dynamics may be seen as a sequence of short “Winner-Take-All (WTA)” fragments.
- Cortical recurrent excitation may enable “soft-max” WTA and retention of information.
- Striatal “max” circuit and cortical “soft-max” circuit might co-implement Q-learning.
- Cortical retention of executed action might serve for prediction-error calculation.
- Roles of the suggested complex circuit dynamics in long time scales remain open.
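
The contrast between the “max” and “soft-max” operations in the highlights above can be illustrated with a minimal Python sketch. It is an abstract illustration only, not a model of either circuit: the function names and the inverse-temperature parameter `beta` are assumptions introduced here, not taken from the article.

```python
import math
import random

def max_select(values):
    """Deterministic 'max' selection (hard winner-take-all): index of the largest value."""
    return max(range(len(values)), key=lambda i: values[i])

def softmax_probs(values, beta=1.0):
    """Soft-max choice probabilities; beta (inverse temperature) is an assumed parameter."""
    m = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(beta * (v - m)) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_select(values, beta=1.0, rng=random):
    """Probabilistic selection: sample an action index from the soft-max distribution."""
    probs = softmax_probs(values, beta)
    return rng.choices(range(len(values)), weights=probs, k=1)[0]

# Hypothetical action values for three candidate actions.
action_values = [0.2, 0.9, 0.5]
best = max_select(action_values)                  # always the highest-valued action
probs = softmax_probs(action_values, beta=2.0)    # graded preference over all actions
sampled = softmax_select(action_values, beta=2.0, rng=random.Random(0))
```

With a large `beta` the soft-max choice approaches the deterministic “max” choice; with a small `beta` it approaches uniform random choice.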

Value-based action selection has been suggested to be realized in the corticostriatal local circuits through competition among neural populations. In this article, we review theoretical and experimental studies that have constructed and verified this notion, and provide new perspectives on how the local-circuit selection mechanisms implement reinforcement learning (RL) algorithms and computations beyond them. The striatal neurons are mostly inhibitory, and lateral inhibition among them has been classically proposed to realize “Winner-Take-All (WTA)” selection of the maximum-valued action (i.e., 'max' operation). Although this view has been challenged by the revealed weakness, sparseness, and asymmetry of lateral inhibition, which suggest more complex dynamics, WTA-like competition could still occur on short time scales. Unlike the striatal circuit, the cortical circuit contains recurrent excitation, which may enable retention or temporal integration of information and probabilistic “soft-max” selection. The striatal “max” circuit and the cortical “soft-max” circuit might co-implement an RL algorithm called Q-learning; the cortical circuit might also similarly serve for other algorithms such as SARSA. In these implementations, the cortical circuit presumably sustains activity representing the executed action, which negatively impacts dopamine neurons so that they can calculate reward-prediction-error. Regarding the suggested more complex dynamics of striatal, as well as cortical, circuits on long time scales, which could be viewed as a sequence of short WTA fragments, computational roles remain open: such a sequence might represent (1) sequential state-action-state transitions, constituting replay or simulation of the internal model, (2) a single state/action by the whole trajectory, or (3) probabilistic sampling of state/action.
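
For readers unfamiliar with the two algorithms named in the abstract, the following Python sketch shows the standard textbook forms of the reward-prediction-error in Q-learning and SARSA. It is not the circuit model proposed in the article. The table `q`, the state labels, and the parameters `gamma` (discount factor) and `alpha` (learning rate) are assumptions chosen for illustration.

```python
def q_learning_rpe(q, state, action, reward, next_state, gamma=0.9):
    """Q-learning prediction error: the target uses the maximum value in the next state."""
    target = reward + gamma * max(q[next_state])
    return target - q[state][action]

def sarsa_rpe(q, state, action, reward, next_state, next_action, gamma=0.9):
    """SARSA prediction error: the target uses the value of the action actually chosen next."""
    target = reward + gamma * q[next_state][next_action]
    return target - q[state][action]

def update(q, state, action, rpe, alpha=0.1):
    """Move the value of the executed action toward the target by a fraction alpha."""
    q[state][action] += alpha * rpe

# Hypothetical value table: two states, two actions each.
q = {"s0": [0.0, 0.5], "s1": [1.0, 0.2]}

# Action 1 was executed in s0, no reward was received, and the agent arrived in s1.
delta_q = q_learning_rpe(q, "s0", 1, 0.0, "s1")   # approximately 0.9 * 1.0 - 0.5 = 0.4
delta_s = sarsa_rpe(q, "s0", 1, 0.0, "s1", 1)     # approximately 0.9 * 0.2 - 0.5 = -0.32
update(q, "s0", 1, delta_q)
```

In both functions the value of the executed action, `q[state][action]`, enters with a negative sign. This loosely parallels the abstract's suggestion that sustained activity representing the executed action negatively impacts dopamine neurons.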

Publisher

Database: Elsevier - ScienceDirect
Journal: Behavioural Brain Research - Volume 311, 15 September 2016, Pages 110-121

Authors

Kenji Morita, Jenia Jitsev, Abigail Morrison
