Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
4952116 | Theoretical Computer Science | 2017 | 18 Pages |
Abstract
Constructing the suffix array for a string collection is an important task that may be performed by sorting the concatenation of all strings. In this article we present algorithms gSAIS and gSACA-K, which extend SAIS (Nong et al., 2011) [8] and SACA-K (Nong, 2013) [10] to construct the suffix array for a string collection maintaining their theoretical bounds, respecting the order among all suffixes, and improving their practical performance. gSACA-K is an optimal suffix array construction algorithm for string collections from constant alphabets. Moreover, we show how to modify gSAIS and gSACA-K to also compute the longest common prefix array and the document array as a byproduct of suffix sorting, with the same theoretical bounds. We performed experiments that showed that our algorithms have an outstanding practical performance.
Keywords
Related Topics
Physical Sciences and Engineering
Computer Science
Computational Theory and Mathematics
Authors
Felipe A. Louza, Simon Gog, Guilherme P. Telles,