Article ID Journal Published Year Pages File Type
426917 Information and Computation 2007 10 Pages PDF
Abstract

The edit distance (or Levenshtein distance) between two words is the smallest number of substitutions, insertions, and deletions of symbols that can be used to transform one of the words into the other. In this paper, we consider the problem of computing the edit distance of a regular language (the set of words accepted by a given finite automaton). This quantity is the smallest edit distance between any pair of distinct words of the language. We show that the problem is of polynomial time complexity. In particular, for a given finite automaton A with n transitions, over an alphabet of r symbols, our algorithm operates in time O(n2r2q2(q+r)), where q is either the diameter of A (if A is deterministic), or the square of the number of states in A (if A is nondeterministic). Incidentally, we also obtain an upper bound on the edit distance of a regular language in terms of the automaton accepting the language.

Related Topics
Physical Sciences and Engineering Computer Science Computational Theory and Mathematics