Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
435391 | Theoretical Computer Science | 2009 | 14 Pages |
Abstract
We introduce a new notion of motifs, called masks, that succinctly represents the repeated patterns for an input sequence T of n symbols drawn from an alphabet Σ. We show how to build the set of all frequent maximal masks of length L in O(2Ln) time and space in the worst case, using the Karp–Miller–Rosenberg approach. We analytically show that our algorithm performs better than the method based on constant-time enumerating and checking all the potential (|Σ|+1)L candidate patterns in T, after a polynomial-time preprocessing of T. Our algorithm is also cache-friendly, attaining block transfers, where sort(n) is the cache complexity of sorting n items.
Related Topics
Physical Sciences and Engineering
Computer Science
Computational Theory and Mathematics