Combinatorial bandits

کد مقاله	کد نشریه	سال انتشار	مقاله انگلیسی	نسخه تمام متن
430284	687959	2012	19 صفحه PDF	دانلود رایگان

عنوان انگلیسی مقاله ISI

دانلود مقاله + سفارش ترجمه

دانلود مقاله ISI انگلیسی

رایگان برای ایرانیان

کلمات کلیدی

Online prediction - پیش بینی آنلاین

موضوعات مرتبط

مهندسی و علوم پایه مهندسی کامپیوتر نظریه محاسباتی و ریاضیات

پیش نمایش صفحه اول مقاله

چکیده انگلیسی

We study sequential prediction problems in which, at each time instance, the forecaster chooses a vector from a given finite set S⊆RdS⊆Rd. At the same time, the opponent chooses a “loss” vector in RdRd and the forecaster suffers a loss that is the inner product of the two vectors. The goal of the forecaster is to achieve that, in the long run, the accumulated loss is not much larger than that of the best possible element in SS. We consider the “bandit” setting in which the forecaster only has access to the losses of the chosen vectors (i.e., the entire loss vectors are not observed). We introduce a variant of a strategy by Dani, Hayes and Kakade achieving a regret bound that, for a variety of concrete choices of SS, is of order ndln|S| where n is the time horizon. This is not improvable in general and is better than previously known bounds. The examples we consider are all such that S⊆{0,1}dS⊆{0,1}d, and we show how the combinatorial structure of these classes can be exploited to improve the regret bounds. We also point out computationally efficient implementations for various interesting choices of SS.

► Online linear optimization under bandit feedback.
► Focus on combinatorial decision spaces.
► GeometricHedge algorithm with uniform exploration.
► Regret bounds tight in several cases.
► Known efficient implementation in some cases.

ناشر

Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Journal of Computer and System Sciences - Volume 78, Issue 5, September 2012, Pages 1404–1422

نویسندگان

Nicolò Cesa-Bianchi, Gábor Lugosi,

علوم انسانی و هنر

فنی، مهندسی و علوم پایه

پزشکی و سلامت

بیو تکنولوژی

پذیرش سفارش ترجمه

Combinatorial bandits

دسترسی سریع

ارتباط

English Website