کد مقاله کد نشریه سال انتشار مقاله انگلیسی نسخه تمام متن
430890 688224 2013 11 صفحه PDF دانلود رایگان
عنوان انگلیسی مقاله ISI
Improved compressed indexes for full-text document retrieval
موضوعات مرتبط
مهندسی و علوم پایه مهندسی کامپیوتر نظریه محاسباتی و ریاضیات
پیش نمایش صفحه اول مقاله
Improved compressed indexes for full-text document retrieval
چکیده انگلیسی

We give new space/time tradeoffs for compressed indexes that answer document retrieval queries on general sequences. On a collection of D documents of total length n  , current approaches require at least |CSA|+O(nlgDlglgD) or 2|CSA|+o(n)2|CSA|+o(n) bits of space, where CSACSA is a full-text index. Using monotone minimal perfect hash functions (mmphfs), we give new algorithms for document listing with frequencies and top-k   document retrieval using just |CSA|+O(nlglglgD) bits. We also improve current solutions that use 2|CSA|+o(n)2|CSA|+o(n) bits, and consider other problems such as colored range listing, top-k most important documents, and computing arbitrary frequencies. We give proof-of-concept experimental results that show that using mmphfs may provide relevant practical tradeoffs for document listing with frequencies.

ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Journal of Discrete Algorithms - Volume 18, January 2013, Pages 3–13
نویسندگان
, , ,