کد مقاله کد نشریه سال انتشار مقاله انگلیسی نسخه تمام متن
515047 866940 2011 20 صفحه PDF دانلود رایگان
عنوان انگلیسی مقاله ISI
Interpolative coding of integer sequences supporting log-time random access
موضوعات مرتبط
مهندسی و علوم پایه مهندسی کامپیوتر نرم افزارهای علوم کامپیوتر
پیش نمایش صفحه اول مقاله
Interpolative coding of integer sequences supporting log-time random access
چکیده انگلیسی

Sequences of integers are common data types, occurring either as primary data or ancillary structures. The sizes of sequences can be large, making compression an interesting option. Effective compression presupposes variable-length coding, which destroys the regular alignment of values. Yet it would often be desirable to access only a small subset of the entries, either by position (ordinal number) or by content (element value), without having to decode most of the sequence from the start. Here such a random access technique for compressed integers is described, with the special feature that no auxiliary index is needed. The solution applies a method called interpolative coding, which is one of the most efficient non-statistical codes for integers. Indexing is avoided by address calculation guaranteeing sufficient space for codes even in the worst case. The additional redundancy, compared to regular interpolative coding, is only about 1 bit per source integer for uniform distribution. The time complexity of random access is logarithmic with respect to the source size for both position-based and content-based retrieval. According to experiments, random access is faster than full decoding when the number of accessed integers is not more than approximately 0.75 · n/log2n for sequence length n. The tests also confirm that the method is quite competitive with other approaches to random access coding, suggested in the literature.

Research highlights
► The new coding scheme offers random access to variable-length binary codewords.
► Random access is accomplished by address calculation, without any index.
► The compression power is comparable to other non-statistical codes for integers.
► Access is supported by position number and, for ascending sequences, by content.

ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Information Processing & Management - Volume 47, Issue 5, September 2011, Pages 742–761
نویسندگان
,