| Article ID | Journal | Published Year | Pages | File Type |
|---|---|---|---|---|
| 478689 | European Journal of Operational Research | 2010 | 8 Pages | |
In this article, we aim to analyze the limitations of learning in automata-based systems by introducing the L+ algorithm to replicate quasi-perfect learning, i.e., a situation in which the learner can obtain the correct answer to any of its queries. This extreme assumption allows the generalization of any limitations of the learning algorithm to less sophisticated learning systems. We analyze the conditions under which L+ infers the correct automaton and when it fails to do so. In the context of the repeated prisoners’ dilemma, we exemplify how L+ may fail to learn the correct automaton. We prove that a sufficient condition for the L+ algorithm to learn the correct automaton is to use a large number of look-ahead steps. Finally, we show empirically, in the product differentiation problem, that the computational time of the L+ algorithm is polynomial in the number of states but exponential in the number of agents.
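The abstract's setting, finite automata playing the repeated prisoners' dilemma, can be sketched as follows. This is a minimal illustration of automaton strategies in the repeated game, not the article's own model: the payoff matrix and the two example strategies (tit-for-tat and always-defect) are standard textbook assumptions, and the `Automaton` class is a hypothetical representation.

```python
# Sketch of the repeated prisoners' dilemma between two finite automata
# (Moore machines). Payoffs and strategies are illustrative assumptions,
# not taken from the article.

PAYOFF = {  # (my action, opponent's action) -> my payoff
    ("C", "C"): 3, ("C", "D"): 0,
    ("D", "C"): 5, ("D", "D"): 1,
}

class Automaton:
    """Finite-automaton strategy: each state outputs an action, and the
    transition depends on the opponent's last observed action."""
    def __init__(self, output, transition, start=0):
        self.output = output          # state -> "C" or "D"
        self.transition = transition  # (state, opponent action) -> state
        self.state = start

    def act(self):
        return self.output[self.state]

    def update(self, opp_action):
        self.state = self.transition[(self.state, opp_action)]

def play(a, b, rounds):
    """Play the repeated game; return the total payoffs (a_total, b_total)."""
    ta = tb = 0
    for _ in range(rounds):
        x, y = a.act(), b.act()
        ta += PAYOFF[(x, y)]
        tb += PAYOFF[(y, x)]
        a.update(y)
        b.update(x)
    return ta, tb

# Tit-for-tat: two states, one per last opponent action.
tft = Automaton({0: "C", 1: "D"},
                {(0, "C"): 0, (0, "D"): 1, (1, "C"): 0, (1, "D"): 1})
# Always-defect: a single absorbing state.
alld = Automaton({0: "D"}, {(0, "C"): 0, (0, "D"): 0})

print(play(tft, alld, 10))  # → (9, 14): tit-for-tat loses round 1, then mutual defection
```

A learner in this setting must infer the opponent's automaton (states, outputs, transitions) from observed play, which is where query-based inference algorithms such as the one studied in the article come in.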