Article ID Journal Published Year Pages File Type
436442 Theoretical Computer Science 2014 13 Pages PDF
Abstract

•V-order Burrows–Wheeler type data clustering is introduced.•The multi-word V-order Burrows–Wheeler Transform is proposed.•The construction of a non-lexicographic suffix array is given.•Linear V-order sorting of the conjugates of a word is achieved.

In this paper we introduce the V-transform (V-BWT), a variant of the classic Burrows–Wheeler Transform. The original BWT uses lexicographic order, whereas we apply a distinct total ordering of strings called V-order. V  -order string comparison and Lyndon-like factorization of a string x=x[1..n]x=x[1..n] into V-words have recently been shown to be linear in their use of time and space (Daykin et al., 2011) [18]. Here we apply these subcomputations, along with Θ(n)Θ(n) suffix-sorting (Ko and Aluru, 2003) [26], to implement linear V-sorting of all the rotations of a string. When it is known that the input string x is a V-word, we compute the V  -transform in Θ(n)Θ(n) time and space, and also outline an efficient algorithm for inverting the V-transform and recovering x. We further outline a bijective algorithm in the case that x is arbitrary. We propose future research into other variants of transforms using lex-extension orderings (Daykin et al., 2013) [19]. Motivation for this work arises in possible applications to data compression.

Related Topics
Physical Sciences and Engineering Computer Science Computational Theory and Mathematics
Authors
, ,