Article ID Journal Published Year Pages File Type
4950956 Information Processing Letters 2016 7 Pages PDF
Abstract
Glushkov's nondeterministic finite automaton leads to efficient regular expression matching. But it is memory greedy for long regular expressions. We develop space efficient representations for the deterministic finite automata obtained from Glushkov automata. The approach reduces the space of the DFA from O(m2m) bits to O(m2k) bits where k is the number of strings in the regular expression, m is the number of characters excluding operators. The average space usage is O(m(1+1/σ)k) bits where σ is the size of the alphabet. The state transition function runs in constant time when the length of a word is not less than m. Experiments show that our method is as efficient as the previous method and uses less space.
Related Topics
Physical Sciences and Engineering Computer Science Computational Theory and Mathematics
Authors
, , ,