Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
431317 | Journal of Discrete Algorithms | 2012 | 11 Pages |
The concept of a longest previous factor (LPF) is inherent to Ziv–Lempel factorization of strings in text compression, as well as in statistics of repetitions and symmetries. It is expressed in the form of a table: LPF[i] is the maximum length of a factor starting at position i that also appears earlier in the given text. We show how to compute efficiently three new tables storing different variants of previous factors (past segments) of a string. The longest previous non-overlapping factor for a given position i is the longest factor starting at i that has an exact copy occurring entirely before i, while the longest previous non-overlapping reverse factor for a given position i is the longest factor starting at i whose reverse copy occurs entirely before i. In both problems the previous copies of the factors are required to occur within the prefix ending at position i−1. The longest previous (possibly overlapping) reverse factor is the longest factor starting at i whose reverse copy starts before i.

These problems have not been explicitly considered before, but they have several applications and they are natural extensions of the longest previous factor problem, which has been extensively studied. Moreover, the newly introduced tables store additional information on the structure of the string, which helps to improve, for example, gapped palindrome detection and text compression using reverse factors.
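To make the four definitions concrete, the following is a minimal brute-force sketch in Python. It is not the efficient method the abstract refers to; it simply applies each definition literally and takes roughly cubic time or worse. Several things here are assumptions made for illustration: positions are 0-based, the function name `previous_factor_tables` is hypothetical, the labels `LPnF`, `LPnrF`, and `LPrF` are used as shorthand for the three new tables, and "starts before i" is read as "the copy begins at some position j < i".

```python
def previous_factor_tables(s):
    """Brute-force illustration of the four tables (0-based positions).

    Assumed reading of the definitions, for each position i:
      LPF[i]   longest factor at i with a copy starting at some j < i
               (the copy may overlap position i)
      LPnF[i]  longest factor at i with a copy lying entirely in s[:i]
      LPnrF[i] longest factor at i whose reverse lies entirely in s[:i]
      LPrF[i]  longest factor at i whose reverse starts at some j < i
               (the reversed copy may overlap position i)
    """
    n = len(s)
    lpf, lpnf, lpnrf, lprf = [0] * n, [0] * n, [0] * n, [0] * n
    for i in range(n):
        # Try every length rather than stopping at the first failure:
        # under the literal reading, LPrF is not monotone in the length.
        for length in range(1, n - i + 1):
            w = s[i:i + length]
            r = w[::-1]
            # A copy starting at j <= i - 1 ends at or before i + length - 1.
            overlap_end = i + length - 1
            if s.find(w, 0, overlap_end) != -1:
                lpf[i] = length
            if s.find(w, 0, i) != -1:
                lpnf[i] = length
            if s.find(r, 0, i) != -1:
                lpnrf[i] = length
            if s.find(r, 0, overlap_end) != -1:
                lprf[i] = length
    return {"LPF": lpf, "LPnF": lpnf, "LPnrF": lpnrf, "LPrF": lprf}


tables = previous_factor_tables("aaaa")
print(tables["LPF"])   # [0, 3, 2, 1]
print(tables["LPnF"])  # [0, 1, 2, 1]
```

On `"aaaa"` the overlapping and non-overlapping tables differ at position 1: the factor `"aaa"` has an earlier copy starting at position 0, but only `"a"` fits entirely inside the prefix ending at position 0.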