Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
4942161 | Artificial Intelligence | 2016 | 27 Pages | |
Abstract
Finally, the bandit model considers an arguably more realistic and prevalent setting with only partial information: at each time step, each player knows only the cost of her own currently played strategy, and not the costs of any unplayed strategies. For the class of atomic congestion games, we propose a family of bandit algorithms based on the mirror-descent algorithms previously presented, and show that when each player individually adopts such a bandit algorithm, their joint (mixed) strategy profile quickly converges with implications.
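To make the bandit feedback model concrete, the following is a minimal sketch and not the authors' algorithm. It assumes an EXP3-style update (mirror descent with an entropy regularizer and an importance-weighted cost estimate) in a toy game with parallel links whose cost is the fraction of players using the link. The names `sample`, `bandit_step`, `simulate` and the parameters `eta` (step size) and `gamma` (uniform exploration rate) are hypothetical choices made for illustration.

```python
import math
import random


def sample(probs, rng):
    """Draw an index according to the probability vector `probs`."""
    r = rng.random()
    acc = 0.0
    for i, p in enumerate(probs):
        acc += p
        if r < acc:
            return i
    return len(probs) - 1  # guard against floating-point round-off


def bandit_step(probs, action, cost, eta, gamma):
    """One entropic mirror-descent update from bandit feedback (assumed EXP3-style).

    Only the cost of the played `action` is observed, so the full cost vector
    is replaced by an importance-weighted estimate: cost / probs[action] on the
    played coordinate and 0 on every unplayed one.
    """
    k = len(probs)
    estimate = cost / probs[action]
    weights = list(probs)
    weights[action] *= math.exp(-eta * estimate)
    total = sum(weights)
    # Mix with the uniform distribution so each probability is at least gamma / k.
    return [(1.0 - gamma) * w / total + gamma / k for w in weights]


def simulate(num_players=4, num_links=2, rounds=2000, eta=0.05, gamma=0.05, seed=0):
    """Every player independently runs `bandit_step` on a toy parallel-link game."""
    rng = random.Random(seed)
    profiles = [[1.0 / num_links] * num_links for _ in range(num_players)]
    for _ in range(rounds):
        actions = [sample(p, rng) for p in profiles]
        loads = [actions.count(link) for link in range(num_links)]
        for i, a in enumerate(actions):
            # Assumed cost function: fraction of players sharing the chosen link.
            cost = loads[a] / num_players
            profiles[i] = bandit_step(profiles[i], a, cost, eta, gamma)
    return profiles


if __name__ == "__main__":
    for i, p in enumerate(simulate()):
        print(i, [round(x, 3) for x in p])
```

Each player here sees a single number per round, the cost of the link she actually used, which is the partial-information restriction the abstract describes. The sketch makes no claim about convergence rates; those depend on the analysis in the paper itself.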
Related Topics
Physical Sciences and Engineering
Computer Science
Artificial Intelligence
Authors
Po-An Chen, Chi-Jen Lu