Article code: 493275
Journal code: 721685
Publication year: 2012
English article: 8-page PDF
Full-text version: free download
English title of the ISI article
Inverted File Compression using EGC and FEGC
Related topics
Engineering and Basic Sciences, Computer Engineering, Computer Science (General)
English abstract

Inverted files are the most commonly used technique for efficient query processing and fast text searching in an Information Retrieval System (IRS). However, inverted files become extremely large as the volume of data in the IRS grows rapidly. Compression techniques are therefore used to reduce the index size and increase access speed. In this paper, we propose a new integer compression technique called Fast Extended Golomb Code (FEGC), based on Extended Golomb Code (EGC), to reduce the size and increase the decoding speed of the inverted index. Decoding speed is very important for fast query processing in IRS applications. We have implemented FEGC and EGC and compared their performance with other existing techniques. Experimental results show that EGC performs well and gives better compression than the other existing techniques. EGC also compresses relatively better than FEGC, but it requires more CPU cycles than FEGC to encode and decode an integer. Hence FEGC could be a faster encoder than EGC while giving comparable results in compressing doc-ids for IRS applications.
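
The abstract does not define EGC or FEGC, so neither is reproduced here. As background only, the following is a minimal sketch of the classical Golomb code that both extend, applied to a sorted list of doc-ids. Storing the differences between consecutive doc-ids (d-gaps) instead of the ids themselves is an assumption of this sketch, not something stated in the abstract, and the function names and the parameter `m` are illustrative.

```python
def _fixed(value, width):
    """Return value as a binary string of exactly `width` bits ('' if width is 0)."""
    return format(value, "b").zfill(width) if width else ""


def golomb_encode(n, m):
    """Classical Golomb code of a non-negative integer n with divisor m >= 1."""
    q, r = divmod(n, m)
    bits = "1" * q + "0"            # quotient in unary, terminated by a 0
    b = (m - 1).bit_length()        # ceil(log2(m))
    cutoff = (1 << b) - m           # remainders below cutoff use b - 1 bits
    if r < cutoff:
        return bits + _fixed(r, b - 1)
    return bits + _fixed(r + cutoff, b)


def golomb_decode(bits, pos, m):
    """Decode one integer starting at bits[pos]; return (value, next position)."""
    q = 0
    while bits[pos] == "1":
        q += 1
        pos += 1
    pos += 1                        # skip the terminating 0
    b = (m - 1).bit_length()
    cutoff = (1 << b) - m
    x = 0
    for _ in range(max(b - 1, 0)):  # read the first b - 1 remainder bits
        x = (x << 1) | int(bits[pos])
        pos += 1
    if b > 0 and x >= cutoff:       # long codeword: one more bit follows
        x = ((x << 1) | int(bits[pos])) - cutoff
        pos += 1
    return q * m + x, pos


def encode_postings(doc_ids, m):
    """Encode strictly increasing positive doc-ids as Golomb-coded d-gaps."""
    out, prev = [], 0
    for d in doc_ids:
        out.append(golomb_encode(d - prev - 1, m))  # every gap is >= 1, so store gap - 1
        prev = d
    return "".join(out)


def decode_postings(bits, m):
    """Inverse of encode_postings."""
    doc_ids, prev, pos = [], 0, 0
    while pos < len(bits):
        gap_minus_one, pos = golomb_decode(bits, pos, m)
        prev += gap_minus_one + 1
        doc_ids.append(prev)
    return doc_ids
```

A round trip such as `decode_postings(encode_postings([3, 7, 8, 15, 40], 4), 4)` should give back the original list. Bit strings are used for readability; a practical index would pack the bits into bytes or machine words.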

Publisher
Database: Elsevier - ScienceDirect
Journal: Procedia Technology - Volume 6, 2012, Pages 493-500