The cognitive mechanisms of optimal sampling

Article code	Journal code	Publication year	English article	Full-text version
2427003	1105937	2012	9-page PDF	Free download

Keywords

Foraging; Two-armed bandit; Neural networks; Matching law; Optimal sampling; Reinforcement learning

Related topics

Life Sciences and Biotechnology; Agricultural and Biological Sciences; Animal Science and Zoology

Abstract

How can animals learn the prey densities available in an environment that changes unpredictably from day to day, and how much effort should they devote to doing so, rather than exploiting what they already know? Using a two-armed bandit situation, we simulated several processes that might explain the trade-off between exploring and exploiting. They included an optimising model, dynamic backward sampling; a dynamic version of the matching law; the Rescorla–Wagner model; a neural network model; and ɛ-greedy and rule of thumb models derived from the study of reinforcement learning in artificial intelligence. Under conditions like those used in published studies of birds’ performance under two-armed bandit conditions, all models usually identified the more profitable source of reward, and did so more quickly when the reward probability differential was greater. Only the dynamic programming model switched from exploring to exploiting more quickly when available time in the situation was less. With sessions of equal length presented in blocks, a session-length effect was induced in some of the models by allowing motivational, but not memory, carry-over from one session to the next. The rule of thumb model was the most successful overall, though the neural network model also performed better than the remaining models.
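For orientation, the Rescorla–Wagner model named in the abstract is usually written as an error-correcting update of the associative strength of the chosen option. The form below is the standard textbook rule applied to a two-armed bandit. The abstract does not give the parametrisation used in the paper, so the symbols here are assumptions: V_i is the learned value of arm i, \alpha is a learning rate between 0 and 1, and \lambda is 1 on a rewarded trial and 0 otherwise.

```latex
V_i \leftarrow V_i + \alpha \, (\lambda - V_i), \qquad 0 < \alpha \le 1, \quad \lambda \in \{0, 1\}
```

Only the arm chosen on a given trial is updated, so with a fixed reward probability p_i the value V_i drifts towards p_i at a rate set by \alpha.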

► Behaviour of a number of learning models was simulated in a “two-armed bandit” situation.
► All models found the better arm more quickly when the payoff difference was greater.
► Only a backwards programming model absorbed on one arm sooner in shorter sessions.
► Some other models showed such an effect if motivation carried over between sessions.
► The most successful model used a rule of thumb specific to the precise situation.
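The ε-greedy idea mentioned in the abstract can be illustrated with a minimal sketch of one session on a two-armed bandit. This is not the authors' implementation: the function name `run_epsilon_greedy`, its parameters, the default ε of 0.1, and the use of running-average value estimates are all assumptions made for illustration.

```python
import random


def run_epsilon_greedy(p_arms, n_trials, epsilon=0.1, seed=0):
    """Simulate one session on a two-armed bandit (hypothetical sketch).

    p_arms   -- reward probability of each arm, e.g. (0.7, 0.3)
    n_trials -- session length in trials
    epsilon  -- probability of choosing an arm at random (exploring)
    """
    rng = random.Random(seed)
    counts = [0, 0]          # number of times each arm was chosen
    estimates = [0.0, 0.0]   # running average reward for each arm
    choices = []

    for _ in range(n_trials):
        if rng.random() < epsilon:
            # Explore: pick either arm with equal probability.
            arm = rng.randrange(2)
        else:
            # Exploit: pick the arm with the higher estimate, ties at random.
            best = max(estimates)
            arm = rng.choice([i for i, v in enumerate(estimates) if v == best])

        reward = 1 if rng.random() < p_arms[arm] else 0

        # Incremental update of the sample average for the chosen arm.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        choices.append(arm)

    return choices, estimates


if __name__ == "__main__":
    choices, estimates = run_epsilon_greedy((0.7, 0.3), n_trials=500)
    print("share of choices on arm 0:", choices.count(0) / len(choices))
    print("estimated reward probabilities:", estimates)
```

Because ε is fixed, this sketch explores at the same rate whatever the session length. That is consistent with the highlight that only the backwards programming model settled on one arm sooner in shorter sessions.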

Publisher

Database: Elsevier - ScienceDirect
Journal: Behavioural Processes - Volume 89, Issue 2, February 2012, Pages 77–85

Authors

Stephen E.G. Lea, Ian P.L. McLaren, Susan M. Dow, Donald A. Graft
