Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
436060 | Theoretical Computer Science | 2007 | 10 Pages |
The suffix array is a fundamental index data structure in string algorithms and bioinformatics, and the compressed suffix array (CSA) and the FM-index are its compressed versions. Many algorithms for constructing these index data structures have been developed. Recently, Hon et al. [W.K. Hon, K. Sadakane, W.K. Sung, Breaking a time-and-space barrier in constructing full-text indices, in: Proceedings of the 44th Annual IEEE Symposium on Foundations of Computer Science, 2003, pp. 251–260] proposed a construction algorithm using O(nloglog|Σ|) time and O(nlog|Σ|)-bit working space, which is the fastest algorithm using O(nlog|Σ|)-bit working space.In this paper we give an efficient algorithm to construct the index data structures. Our algorithm constructs the suffix array, the CSA, the FM-index, and Burrows–Wheeler transform using alphabet-independent O(n) time and -bit working space, where α=log32. Our algorithm takes less time and more space than Hon et al.’s algorithm. Our algorithm uses least working space among alphabet-independent linear-time algorithms.