Article ID: 431317 | Journal: Journal of Discrete Algorithms | Published Year: 2012 | Pages: 11 | File Type: PDF
Abstract

The concept of a longest previous factor (LPF) is inherent to Ziv–Lempel factorization of strings in text compression, as well as in statistics of repetitions and symmetries. It is expressed in the form of a table: LPF[i] is the maximum length of a factor starting at position i that also appears earlier in the given text. We show how to efficiently compute three new tables storing different variants of previous factors (past segments) of a string. The longest previous non-overlapping factor for a given position i is the longest factor starting at i that has an exact copy occurring entirely before i, while the longest previous non-overlapping reverse factor for a given position i is the longest factor starting at i whose reverse copy occurs entirely before i. In both problems the previous copies of the factors are required to occur within the prefix ending at position i−1. The longest previous (possibly overlapping) reverse factor is the longest factor starting at i whose reverse copy starts before i. These problems have not been explicitly considered before, but they have several applications and are natural extensions of the longest previous factor problem, which has been extensively studied. Moreover, the newly introduced tables store additional information on the structure of the string, which helps to improve, for example, gapped palindrome detection and text compression using reverse factors.
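
The four definitions above can be made concrete with a brute-force sketch. It is only an illustration of what each table stores, not the efficient algorithms the paper announces. The function name `previous_factor_tables`, the 0-based indexing, and the table names `LPnF`, `LPnrF` and `LPrF` are assumptions made for this example; the abstract itself names only LPF.

```python
def previous_factor_tables(s):
    """Brute-force reference for the LPF table and its three variants.

    Assumptions (not from the source): positions are 0-based, and the
    returned keys LPnF / LPnrF / LPrF are illustrative names only.
    Runs in roughly O(n^4) time, so it is suitable only for tiny inputs.
    """
    n = len(s)
    lpf = [0] * n    # copy starts before i, may overlap position i
    lpnf = [0] * n   # copy lies entirely inside s[0:i]
    lpnrf = [0] * n  # reversed copy lies entirely inside s[0:i]
    lprf = [0] * n   # reversed copy starts before i, may overlap

    for i in range(n):
        for k in range(1, n - i + 1):
            factor = s[i:i + k]
            rev = factor[::-1]
            for j in range(i):          # candidate start of the earlier copy
                if j + k > n:           # window would run past the text
                    break
                window = s[j:j + k]
                ends_before_i = j + k <= i
                if window == factor:
                    lpf[i] = max(lpf[i], k)
                    if ends_before_i:
                        lpnf[i] = max(lpnf[i], k)
                if window == rev:
                    lprf[i] = max(lprf[i], k)
                    if ends_before_i:
                        lpnrf[i] = max(lpnrf[i], k)

    return {"LPF": lpf, "LPnF": lpnf, "LPnrF": lpnrf, "LPrF": lprf}


if __name__ == "__main__":
    for name, table in previous_factor_tables("abaab").items():
        print(name, table)
```

Every length k is checked independently rather than stopping at the first failure, because under the literal reading "reverse copy starts before i" a shorter factor can fail while a longer one succeeds. The non-overlapping variants differ from their overlapping counterparts only in the extra condition that the earlier copy ends by position i−1.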

Related Topics
Physical Sciences and Engineering > Computer Science > Computational Theory and Mathematics