کد مقاله کد نشریه سال انتشار مقاله انگلیسی نسخه تمام متن
4947223 1439569 2017 24 صفحه PDF دانلود رایگان
عنوان انگلیسی مقاله ISI
Finite budget analysis of multi-armed bandit problems
ترجمه فارسی عنوان
تجزیه و تحلیل بودجه محدود مشکلات راهزنی چند مسلح
کلمات کلیدی
موضوعات مرتبط
مهندسی و علوم پایه مهندسی کامپیوتر هوش مصنوعی
چکیده انگلیسی
In the budgeted multi-armed bandit (MAB) problem, a player receives a random reward and needs to pay a cost after pulling an arm, and he cannot pull any more arm after running out of budget. In this paper, we give an extensive study of the upper confidence bound based algorithms and a greedy algorithm for budgeted MABs. We perform theoretical analysis on the proposed algorithms, and show that they all enjoy sublinear regret bounds with respect to the budget B. Furthermore, by carefully choosing the parameters, they can even achieve log linear regret bounds. We also prove that the asymptotic lower bound for budgeted Bernoulli bandits is Ω(ln B). Our proof technique can be used to improve the theoretical results for fractional KUBE [26] and Budgeted Thompson Sampling [30].
ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Neurocomputing - Volume 258, 4 October 2017, Pages 13-29
نویسندگان
, , , , , , ,