Article ID: 478689
Journal: European Journal of Operational Research
Published Year: 2010
Pages: 8
File Type: PDF
Abstract

In this article, we aim to analyze the limitations of learning in automata-based systems by introducing the L+ algorithm to replicate quasi-perfect learning, i.e., a situation in which the learner can get the correct answer to any of his queries. This extreme assumption allows the generalization of any limitations of the learning algorithm to less sophisticated learning systems. We analyze the conditions under which the L+ algorithm infers the correct automaton and when it fails to do so. In the context of the repeated prisoners’ dilemma, we exemplify how the L+ algorithm may fail to learn the correct automaton. We prove that a sufficient condition for the L+ algorithm to learn the correct automaton is the use of a large number of look-ahead steps. Finally, we show empirically, in the product differentiation problem, that the computational time of the L+ algorithm is polynomial in the number of states but exponential in the number of agents.

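To illustrate why the number of look-ahead steps matters when inferring strategy automata in the repeated prisoners' dilemma, the sketch below (not the authors' L+ implementation; the class and function names are illustrative) encodes two well-known strategies, tit-for-tat and grim trigger, as Moore machines and checks whether any opponent sequence within a bounded look-ahead horizon makes them play differently. With a horizon that is too short, the two automata are indistinguishable, so a query-based learner could settle on the wrong one.

```python
# Minimal sketch (not the paper's L+ algorithm): strategies for the repeated
# prisoners' dilemma as Moore machines, plus a bounded look-ahead distinguishability check.
from itertools import product


class StrategyAutomaton:
    """A strategy as a Moore machine: each state outputs 'C' or 'D',
    and transitions are driven by the opponent's last action."""

    def __init__(self, outputs, transitions, start=0):
        self.outputs = outputs          # state -> action played in that state
        self.transitions = transitions  # (state, opponent_action) -> next state
        self.start = start

    def play(self, opponent_history):
        """Actions this automaton plays against a fixed opponent action sequence."""
        state, actions = self.start, []
        for opp in opponent_history:
            actions.append(self.outputs[state])
            state = self.transitions[(state, opp)]
        return actions


def distinguishable(a, b, look_ahead):
    """True if some opponent sequence of length <= look_ahead makes a and b play
    differently; False means the look-ahead horizon cannot tell them apart."""
    for length in range(1, look_ahead + 1):
        for history in product("CD", repeat=length):
            if a.play(history) != b.play(history):
                return True
    return False


# Tit-for-tat: start by cooperating, then copy the opponent's last move.
tit_for_tat = StrategyAutomaton(
    outputs={0: "C", 1: "D"},
    transitions={(0, "C"): 0, (0, "D"): 1, (1, "C"): 0, (1, "D"): 1},
)

# Grim trigger: cooperate until the opponent defects once, then defect forever.
grim = StrategyAutomaton(
    outputs={0: "C", 1: "D"},
    transitions={(0, "C"): 0, (0, "D"): 1, (1, "C"): 1, (1, "D"): 1},
)

if __name__ == "__main__":
    print(distinguishable(tit_for_tat, grim, look_ahead=2))  # False: horizon too short
    print(distinguishable(tit_for_tat, grim, look_ahead=3))  # True: after D, C the outputs diverge
```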
Related Topics
Physical Sciences and Engineering > Computer Science > Computer Science (General)