Article ID Journal Published Year Pages File Type
431634 Journal of Discrete Algorithms 2015 12 Pages PDF
Abstract

We discuss an improved method of Variable-to-Fixed length code (VF code) encoding. A VF code is a coding scheme that splits an input text into a consecutive sequence of substrings, and then assigns a fixed length codeword to each substring. Since all the codewords have the same length, they are easily extracted, and thus, VF code is suitable for processing compressed texts directly. Almost Instantaneous VF code (AIVF code), which was proposed by Yamamoto and Yokoo in 2001, achieves a good compression ratio by using a set of parse trees. However, it requires more time and space for both encoding and decoding than does a classical VF code. In this paper, we prove that the set of parse trees of AIVF code can be multiplexed into a compact single tree and the original encoding and decoding procedures can be simulated using the compacted tree. We also give the upper and lower bounds of the number of nodes in the multiplexed parse tree, in addition to those of the number of nodes reduced from the original trees. The experimental results showed that the proposed method can encode natural language texts much faster than AIVF coding.

Related Topics
Physical Sciences and Engineering Computer Science Computational Theory and Mathematics
Authors
, ,