کد مقاله کد نشریه سال انتشار مقاله انگلیسی نسخه تمام متن
434080 689678 2015 9 صفحه PDF دانلود رایگان
عنوان انگلیسی مقاله ISI
On hardness of several string indexing problems
ترجمه فارسی عنوان
در سختی چندین مشکل نشان دادن رشته
کلمات کلیدی
بازیابی سند، ساختارهای داده، رشته جستجو، مرزهای پایین، ضرب ماتریس بولی
موضوعات مرتبط
مهندسی و علوم پایه مهندسی کامپیوتر نظریه محاسباتی و ریاضیات
چکیده انگلیسی

Let D={d1,d2,…,dD}D={d1,d2,…,dD} be a collection of D string documents of n   characters in total. The two-pattern matching problems ask to index DD for answering the following queries efficiently.
• Report/count the unique documents containing P1P1and  P2P2.
• Report/count the unique documents containing P1P1, but not  P2P2.Here P1P1 and P2P2 represent input patterns of length p1p1 and p2p2 respectively. Linear space data structures with O(p1+p2+nklogO(1)⁡n) query cost are already known for the reporting version, where k represents the output size. For the counting version (i.e., report the value k  ), a simple linear-space index with O(p1+p2+n) query cost can be constructed in O(n3/2)O(n3/2) time. However, it is still not known if these are the best possible bounds for these problems. In this paper, we show a strong connection between these string indexing problems and the boolean matrix multiplication problem. Based on this, we argue that these results cannot be improved significantly using purely combinatorial techniques. We also provide an improved upper bound for a related problem known as common colors query problem.

ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Theoretical Computer Science - Volume 582, 31 May 2015, Pages 74–82
نویسندگان
, , , ,