Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
9953303 | Operations Research Letters | 2018 | 4 Pages |
Abstract
In this short note we consider a dynamic assortment planning problem under the capacitated multinomial logit (MNL) bandit model. We prove a tight lower bound on the accumulated regret that matches existing regret upper bounds for all parameters (time horizon T, number of items N
and maximum assortment capacity K) up to logarithmic factors. Our results close an O(K) gap between upper and lower regret bounds from existing works.
Keywords
Related Topics
Physical Sciences and Engineering
Mathematics
Discrete Mathematics and Combinatorics
Authors
Xi Chen, Yining Wang,