Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
5071599 | Games and Economic Behavior | 2015 | 22 Pages |
Abstract
We propose a new no-regret learning algorithm. When used against an adversary, our algorithm achieves average regret that scales optimally as O(1T) with the number T of rounds. However, when our algorithm is used by both players of a zero-sum game, their average regret scales as O(lnâ¡TT), guaranteeing a near-linear rate of convergence to the value of the game. This represents an almost-quadratic improvement on the rate of convergence to the value of a zero-sum game known to be achievable by any no-regret learning algorithm. Moreover, it is essentially optimal as we also show a lower bound of Ω(1T) for all distributed dynamics, as long as the players do not know their payoff matrices in the beginning of the dynamics. (If they did, they could privately compute minimax strategies and play them ad infinitum.)
Related Topics
Social Sciences and Humanities
Economics, Econometrics and Finance
Economics and Econometrics
Authors
Constantinos Daskalakis, Alan Deckelbaum, Anthony Kim,