The lagging anchor model for game learning: a solution to the Crawford puzzle

Article ID	Journal	Published Year	Pages	File Type
10437781	Journal of Economic Behavior & Organization	2005	17 Pages	PDF

Abstract

In matrix games with fully mixed solutions, simultaneous gradient ascent by both players does not converge, a fact known as the Crawford puzzle. We suggest the lagging anchor learning model, which we prove to give convergence, as a solution to this puzzle. Our learning model can be viewed as a reinforcement learning process where the players perform relatively little computation. We compare our learning model with other published solutions to the puzzle. We also prove a generalization of the Crawford puzzle by identifying a broad class of learning rules that cannot produce exponential stability of solution points.
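
To make the contrast in the abstract concrete, the following is a minimal sketch in Python, not the authors' implementation. It uses matching pennies as an example of a matrix game with a fully mixed solution (each player mixes 50/50). The anchor update used here (each strategy is pulled toward a slowly moving "anchor" that in turn drifts toward the strategy) is an assumed form of a lagging anchor dynamic and is not taken from the abstract. The names `simulate`, `eta`, `dt`, `p_anchor`, and `q_anchor`, along with all numeric values, are hypothetical choices for illustration.

```python
import math


def simulate(eta, steps=2000, dt=0.01, p0=0.55, q0=0.55):
    """Euler-discretised simultaneous gradient ascent in matching pennies.

    p, q are the probabilities that players 1 and 2 play "heads".
    Player 1's expected payoff is u = (2p - 1)(2q - 1); player 2 receives -u.
    eta = 0 gives plain gradient ascent. eta > 0 adds the ASSUMED anchor
    terms: each strategy is pulled toward its anchor, and each anchor
    drifts toward the current strategy.
    Returns the final distance from the mixed solution (0.5, 0.5).
    """
    p, q = p0, q0
    p_anchor, q_anchor = p0, q0  # anchors start at the initial strategies
    for _ in range(steps):
        grad_p = 2.0 * (2.0 * q - 1.0)   # d u / d p
        grad_q = -2.0 * (2.0 * p - 1.0)  # d(-u) / d q
        new_p = p + dt * (grad_p + eta * (p_anchor - p))
        new_q = q + dt * (grad_q + eta * (q_anchor - q))
        new_p_anchor = p_anchor + dt * eta * (p - p_anchor)
        new_q_anchor = q_anchor + dt * eta * (q - q_anchor)
        # Crude projection: keep probabilities inside [0, 1].
        p = min(1.0, max(0.0, new_p))
        q = min(1.0, max(0.0, new_q))
        p_anchor, q_anchor = new_p_anchor, new_q_anchor
    return math.hypot(p - 0.5, q - 0.5)


if __name__ == "__main__":
    start = math.hypot(0.05, 0.05)
    print("initial distance from solution:", start)
    print("plain gradient ascent (eta=0): ", simulate(eta=0.0))
    print("with anchor terms (eta=1):     ", simulate(eta=1.0))
```

Under these assumptions, the plain run keeps circling the mixed solution and, because of the discretisation, drifts slowly away from it, while the anchored run spirals inward. The sketch illustrates the qualitative behaviour only; it does not reproduce the paper's proofs or its exact learning rule.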

Keywords

Convergence; Reinforcement learning; Learning in games