کد مقاله کد نشریه سال انتشار مقاله انگلیسی نسخه تمام متن
430794 688153 2007 10 صفحه PDF دانلود رایگان
عنوان انگلیسی مقاله ISI
Optimal spaced seeds for faster approximate string matching
موضوعات مرتبط
مهندسی و علوم پایه مهندسی کامپیوتر نظریه محاسباتی و ریاضیات
پیش نمایش صفحه اول مقاله
Optimal spaced seeds for faster approximate string matching
چکیده انگلیسی

Filtering is a standard technique for fast approximate string matching in practice. In filtering, a quick first step is used to rule out almost all positions of a text as possible starting positions for a pattern. Typically this step consists of finding the exact matches of small parts of the pattern. In the followup step, a slow method is used to verify or eliminate each remaining position. The running time of such a method depends largely on the quality of the filtering step, as measured by its false positives rate. The quality of such a method depends on the number of true matches that it misses, that is, on its false negative rate. A spaced seed is a recently introduced type of filter pattern that allows gaps (i.e. do not cares) in the small sub-pattern to be searched for. Spaced seeds promise to yield a much lower false positives rate, and thus have been extensively studied, though heretofore only heuristically or statistically. In this paper, we show how to design almost optimal spaced seeds that yield no false negatives.

ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Journal of Computer and System Sciences - Volume 73, Issue 7, November 2007, Pages 1035-1044