| Article ID | Journal | Published Year | Pages | File Type | 
|---|---|---|---|---|
| 5071599 | Games and Economic Behavior | 2015 | 22 Pages | 
Abstract
												We propose a new no-regret learning algorithm. When used against an adversary, our algorithm achieves average regret that scales optimally as O(1T) with the number T of rounds. However, when our algorithm is used by both players of a zero-sum game, their average regret scales as O(lnâ¡TT), guaranteeing a near-linear rate of convergence to the value of the game. This represents an almost-quadratic improvement on the rate of convergence to the value of a zero-sum game known to be achievable by any no-regret learning algorithm. Moreover, it is essentially optimal as we also show a lower bound of Ω(1T) for all distributed dynamics, as long as the players do not know their payoff matrices in the beginning of the dynamics. (If they did, they could privately compute minimax strategies and play them ad infinitum.)
											Related Topics
												
													Social Sciences and Humanities
													Economics, Econometrics and Finance
													Economics and Econometrics
												
											Authors
												Constantinos Daskalakis, Alan Deckelbaum, Anthony Kim, 
											