Article ID Journal Published Year Pages File Type
435987 Theoretical Computer Science 2015 12 Pages PDF
Abstract

We address a variant of the dictionary matching problem where the dictionary is represented by a straight line program (SLP). For a given SLP-compressed dictionary DD of size n and height h representing m patterns of total length N  , we present an O(n2log⁡N)O(n2log⁡N)-size representation of Aho–Corasick automaton which recognizes all occurrences of the patterns in DD in amortized O(h+m)O(h+m) running time per character. We also propose an algorithm to construct this compressed Aho–Corasick automaton in O(n3log⁡nlog⁡N)O(n3log⁡nlog⁡N) time and O(n2log⁡N)O(n2log⁡N) space. In a spacial case where DD represents only a single pattern, we present an O(nlog⁡N)O(nlog⁡N)-size representation of the Morris–Pratt automaton which permits us to find all occurrences of the pattern in amortized O(h)O(h) running time per character, and we show how to construct this representation in O(n3log⁡nlog⁡N)O(n3log⁡nlog⁡N) time with O(n2log⁡N)O(n2log⁡N) working space.

Related Topics
Physical Sciences and Engineering Computer Science Computational Theory and Mathematics
Authors
, , , , ,