کد مقاله کد نشریه سال انتشار مقاله انگلیسی نسخه تمام متن
461869 696642 2011 7 صفحه PDF دانلود رایگان
عنوان انگلیسی مقاله ISI
Scalable, statistical storage allocation for extensible inverted file construction
موضوعات مرتبط
مهندسی و علوم پایه مهندسی کامپیوتر شبکه های کامپیوتری و ارتباطات
پیش نمایش صفحه اول مقاله
Scalable, statistical storage allocation for extensible inverted file construction
چکیده انگلیسی

An Inverted file is a commonly used index for both archival databases and free text where no updates are expected. Applications like information filtering and dynamic environments like the Internet require inverted files to be updated efficiently. Recently, extensible inverted files are proposed which can be used for fast online indexing. The effective storage allocation scheme for such inverted files uses the arrival rate to preallocate storage. In this article, this storage allocation scheme is improved by using information about both the arrival rates and their variability to predict the storage needed, as well as scaling the storage allocation by a logarithmic factor. The resultant, final storage utilization rate can be as high as 97–98% after indexing about 1.6 million documents. This compares favorably with the storage utilization rate of the original arrival rate storage allocation scheme. Our evaluation shows that the retrieval time for extensible inverted file on solid state disk is on average similar to the retrieval time for in-memory extensible inverted file. When file seek time is not an issue, our scalable storage allocation enables extensible inverted files to be used as the main index on disk. Our statistical storage allocation may be applicable to novel situations where the arrival of items follows a binomial, Poisson or normal distribution.

ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Journal of Systems and Software - Volume 84, Issue 7, July 2011, Pages 1082–1088
نویسندگان
,