New algorithms for binary jumbled pattern matching

Article ID	Journal	Published Year	Pages	File Type
427483	Information Processing Letters	2013	5 Pages	PDF

Abstract

•Two new algorithms for binary jumbled pattern matching.•We present the first solution for this problem based on word-level parallelism.•A variant of the RLE based index with better query time at the price of extra space.

Given a pattern P and a text T, both strings over a binary alphabet, the binary jumbled string matching problem consists in telling whether any permutation of P occurs in T . The indexed version of this problem, i.e., preprocessing a string to efficiently answer such permutation queries, is hard and has been studied in the last few years. Currently the best bounds for this problem are O(n2/log2n) (with O(n)O(n) space and O(1)O(1) query time) (Moosa and Rahman (2012) [1]) and O(r2logr) (with O(|L|)O(|L|) space and O(log|L|)O(log|L|) query time) (Badkobeh et al. (2012) [2]), where r is the length of the run-length encoding of T and |L|=O(n)|L|=O(n) is the size of the index. In this paper we present new results for this problem. Our first result is an alternative construction of the index by Badkobeh et al. (2012) [2] that obtains a trade-off between the space and the time complexity. It has O(r2logk+n/k) complexity to build the index, O(logk) query time, and uses O(n/k+|L|)O(n/k+|L|) space, where k is a parameter. The second result is an O(n2log2w/w) algorithm (with O(n)O(n) space and O(1)O(1) query time), based on word-level parallelism where w is the word size in bits.

Keywords

Word-level parallelism String algorithms Combinatorial problems Run-length encoding