Article ID Journal Published Year Pages File Type
9953303 Operations Research Letters 2018 4 Pages PDF
Abstract
In this short note we consider a dynamic assortment planning problem under the capacitated multinomial logit (MNL) bandit model. We prove a tight lower bound on the accumulated regret that matches existing regret upper bounds for all parameters (time horizon T, number of items N and maximum assortment capacity K) up to logarithmic factors. Our results close an O(K) gap between upper and lower regret bounds from existing works.
Related Topics
Physical Sciences and Engineering Mathematics Discrete Mathematics and Combinatorics
Authors
, ,