کد مقاله | کد نشریه | سال انتشار | مقاله انگلیسی | نسخه تمام متن |
---|---|---|---|---|
435561 | 689915 | 2016 | 14 صفحه PDF | دانلود رایگان |
Pattern matching between traditional strings is well-defined for both uncompressed and compressed sequences. Prior to this work, parameterized pattern matching (p-matching) was defined predominately by the matching between uncompressed parameterized strings (p-strings) from the constant alphabet Σ and the parameter alphabet Π. In this work, we define the compressed parameterized pattern matching (compressed p-matching) problem to find all of the p-matches between a pattern P and text T, using only P and the compressed text TcTc. Initially, we present parameterized compression (p-compression) as a new way to losslessly compress data. Experimentally, we show that p-compression is competitive with various other standard compression schemes. Subsequently, we provide the compression and decompression algorithms. Next, two different approaches are developed to address the compressed p-matching problem: (1) using the recently proposed parameterized arithmetic codes (pAC) and (2) using the parameterized border array (p-border). Our general solution is independent of the underlying compression scheme. The results are further examined for catenate, Tunstall codes, Huffman codes, and LZSS.
Journal: Theoretical Computer Science - Volume 609, Part 1, 4 January 2016, Pages 129–142