Article ID Journal Published Year Pages File Type
449338 AEU - International Journal of Electronics and Communications 2011 9 Pages PDF
Abstract

We propose an efficient, sub-optimal prefix code construction method for discrete sources with a finite alphabet and known probability mass function (pmf). It is well known that for a source that puts out symbols xixi with probability pipi, the optimal codeword lengths are li=log(1/pi)li=log(1/pi). However, codeword lengths are integers and log(1/pi)log(1/pi) is, in general, not an integer. We propose a method to find binary codewords for xixi whose lengths are initially assumed to be ⌈log(1/pi)⌉−1⌈log(1/pi)⌉−1. Every prefix code must satisfy the Kraft's inequality but our initial codeword lengths may not satisfy the Kraft's inequality. Using a simplified version of the subset sum problem we find a minimal set of codeword lengths that must be increased from ⌈log(1/pi)⌉−1⌈log(1/pi)⌉−1 to ⌈log(1/pi)⌉⌈log(1/pi)⌉, so that Kraft's inequality is satisfied. Even though this solution is not optimal it leads to average codeword lengths that are close to optimal and in some cases codeword lengths that are optimal. Unlike the Huffman code, our solution does not require the ordering of probabilities in the pmf. The efficiency of our method can be further improved by reducing the size of the subset sum problem. The example of English text shows that our method leads to a solution that is very close to the optimal solution. The proposed method can also be used for encryption, thereby accomplishing both compression and encryption simultaneously.

Related Topics
Physical Sciences and Engineering Computer Science Computer Networks and Communications
Authors
, ,