Smaller representation of finite state automata

Article ID	Journal	Published Year	Pages	File Type
434794	Theoretical Computer Science	2012	12 Pages	PDF

Abstract

This paper is a follow-up to Jan Daciuk’s experiments on space-efficient finite state automata representations that can be used directly for traversals in main memory (Daciuk, 2000) [4]. We investigate several techniques for reducing the memory footprint of minimal automata, mainly exploiting the fact that transition labels and transition pointer offset values are not evenly distributed and are therefore suitable for compression. We achieve a size reduction of around 20%–30% compared to the original representation given in [4]. This result is comparable to state-of-the-art dictionary compression techniques such as the LZ-trie method (Ristov and Laporte, 1999) [15], while construction remains memory and CPU efficient.
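To illustrate why unevenly distributed offset values compress well, here is a minimal sketch of a generic variable-length integer encoding, in which small values take one byte and larger values take more. This is not presented as the encoding used in the paper: the function names `encode_varint` and `decode_varint`, the 7-bits-per-byte layout, and the sample `offsets` list are all assumptions made for illustration.

```python
from typing import List, Tuple


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer using 7 data bits per byte (illustrative scheme)."""
    if value < 0:
        raise ValueError("offsets must be non-negative")
    out = bytearray()
    while True:
        low = value & 0x7F
        value >>= 7
        if value:
            out.append(low | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low)         # high bit clear: last byte
            return bytes(out)


def decode_varint(data: bytes, pos: int = 0) -> Tuple[int, int]:
    """Decode one integer starting at pos; return (value, next position)."""
    value = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, pos
        shift += 7


# Hypothetical, skewed offsets: most are small, a few are large.
offsets: List[int] = [3, 17, 5, 120, 9, 70000, 2, 300]

packed = b"".join(encode_varint(o) for o in offsets)
fixed_size = 4 * len(offsets)  # size if every offset used a fixed 4-byte field

# Decode everything back to check the encoding is lossless.
decoded: List[int] = []
pos = 0
while pos < len(packed):
    value, pos = decode_varint(packed, pos)
    decoded.append(value)

print(len(packed), "bytes variable-length vs", fixed_size, "bytes fixed-width")
```

With a skewed distribution like the hypothetical one above, most offsets fit in a single byte, so the packed form is much smaller than a fixed-width layout. The decoder still reads each value sequentially without unpacking the whole array, which matters for a representation that is traversed directly in memory.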