Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
493275 | 721685 | 2012 | 8-page PDF | Free download |

Inverted files are the most commonly used technique for efficient query processing and fast text searching in an Information Retrieval System (IRS). However, inverted files become extremely large because of the rapid growth of the data held in such systems. Compression techniques are therefore used to reduce the index size and increase access speed. In this paper, we propose a new integer compression technique called Fast Extended Golomb Code (FEGC), based on Extended Golomb Code (EGC), to reduce the size of the inverted index and to increase its decoding speed. Decoding speed is very important for fast query processing in IRS applications. We have implemented FEGC and EGC and compared their performance with other existing techniques. Experimental results show that EGC performs well and gives better compression than the other existing techniques. EGC also gives relatively better compression than FEGC. However, EGC requires more CPU cycles than FEGC to encode and decode an integer. Hence FEGC can be a faster encoder than EGC while giving results comparable to EGC in compressing doc-ids for IRS applications.
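To make the idea of compressing doc-ids concrete, the following is a minimal sketch of the classical approach that codes of this family build on: store the gaps between successive doc-ids and encode each gap with a Golomb code. It uses the standard Rice variant (a Golomb code whose divisor is a power of two, `2**k`), not the EGC or FEGC schemes of the paper, whose details the abstract does not give. The function names, the bit-string representation, and the choice of `k` are illustrative assumptions.

```python
def to_gaps(doc_ids):
    """Turn a sorted list of doc-ids into the first id plus successive gaps."""
    return [cur - prev for prev, cur in zip([0] + doc_ids[:-1], doc_ids)]


def rice_encode(n, k):
    """Standard Rice code (Golomb with divisor 2**k) of n >= 0, as a bit string."""
    q, r = n >> k, n & ((1 << k) - 1)
    unary = "1" * q + "0"                      # quotient in unary, 0-terminated
    binary = format(r, "0{}b".format(k)) if k > 0 else ""  # remainder in k bits
    return unary + binary


def rice_decode(bits, k, count):
    """Decode `count` Rice-coded integers from a bit string."""
    values, pos = [], 0
    for _ in range(count):
        q = 0
        while bits[pos] == "1":                # read the unary quotient
            q += 1
            pos += 1
        pos += 1                               # skip the terminating 0
        r = int(bits[pos:pos + k], 2) if k > 0 else 0
        pos += k
        values.append((q << k) | r)
    return values


def compress_postings(doc_ids, k):
    """Encode a postings list; gaps are >= 1, so gap - 1 is what gets coded."""
    return "".join(rice_encode(g - 1, k) for g in to_gaps(doc_ids))


def decompress_postings(bits, k, count):
    """Rebuild the doc-ids by decoding the gaps and taking running sums."""
    doc_ids, total = [], 0
    for v in rice_decode(bits, k, count):
        total += v + 1
        doc_ids.append(total)
    return doc_ids


postings = [3, 7, 8, 20]
encoded = compress_postings(postings, k=2)
restored = decompress_postings(encoded, k=2, count=len(postings))
```

In this sketch the decoder reads the unary part one bit at a time, which illustrates why the abstract treats the cost of decoding an integer as a separate concern from the compressed size.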
Journal: Procedia Technology - Volume 6, 2012, Pages 493-500