| Article ID | Journal | Published Year | Pages | File Type |
|---|---|---|---|---|
| 435561 | Theoretical Computer Science | 2016 | 14 Pages |
Pattern matching between traditional strings is well-defined for both uncompressed and compressed sequences. Prior to this work, parameterized pattern matching (p-matching) was defined predominately by the matching between uncompressed parameterized strings (p-strings) from the constant alphabet Σ and the parameter alphabet Π. In this work, we define the compressed parameterized pattern matching (compressed p-matching) problem to find all of the p-matches between a pattern P and text T, using only P and the compressed text TcTc. Initially, we present parameterized compression (p-compression) as a new way to losslessly compress data. Experimentally, we show that p-compression is competitive with various other standard compression schemes. Subsequently, we provide the compression and decompression algorithms. Next, two different approaches are developed to address the compressed p-matching problem: (1) using the recently proposed parameterized arithmetic codes (pAC) and (2) using the parameterized border array (p-border). Our general solution is independent of the underlying compression scheme. The results are further examined for catenate, Tunstall codes, Huffman codes, and LZSS.
