کد مقاله کد نشریه سال انتشار مقاله انگلیسی نسخه تمام متن
449338 1443201 2011 9 صفحه PDF دانلود رایگان
عنوان انگلیسی مقاله ISI
Sub-optimal data compression and the subset sum problem
موضوعات مرتبط
مهندسی و علوم پایه مهندسی کامپیوتر شبکه های کامپیوتری و ارتباطات
پیش نمایش صفحه اول مقاله
Sub-optimal data compression and the subset sum problem
چکیده انگلیسی

We propose an efficient, sub-optimal prefix code construction method for discrete sources with a finite alphabet and known probability mass function (pmf). It is well known that for a source that puts out symbols xixi with probability pipi, the optimal codeword lengths are li=log(1/pi)li=log(1/pi). However, codeword lengths are integers and log(1/pi)log(1/pi) is, in general, not an integer. We propose a method to find binary codewords for xixi whose lengths are initially assumed to be ⌈log(1/pi)⌉−1⌈log(1/pi)⌉−1. Every prefix code must satisfy the Kraft's inequality but our initial codeword lengths may not satisfy the Kraft's inequality. Using a simplified version of the subset sum problem we find a minimal set of codeword lengths that must be increased from ⌈log(1/pi)⌉−1⌈log(1/pi)⌉−1 to ⌈log(1/pi)⌉⌈log(1/pi)⌉, so that Kraft's inequality is satisfied. Even though this solution is not optimal it leads to average codeword lengths that are close to optimal and in some cases codeword lengths that are optimal. Unlike the Huffman code, our solution does not require the ordering of probabilities in the pmf. The efficiency of our method can be further improved by reducing the size of the subset sum problem. The example of English text shows that our method leads to a solution that is very close to the optimal solution. The proposed method can also be used for encryption, thereby accomplishing both compression and encryption simultaneously.

ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: AEU - International Journal of Electronics and Communications - Volume 65, Issue 1, January 2011, Pages 53–61
نویسندگان
, ,