کد مقاله کد نشریه سال انتشار مقاله انگلیسی نسخه تمام متن
429164 687066 2008 7 صفحه PDF دانلود رایگان
عنوان انگلیسی مقاله ISI
Improving the bit-parallel NFA of Baeza-Yates and Navarro for approximate string matching
موضوعات مرتبط
مهندسی و علوم پایه مهندسی کامپیوتر نظریه محاسباتی و ریاضیات
پیش نمایش صفحه اول مقاله
Improving the bit-parallel NFA of Baeza-Yates and Navarro for approximate string matching
چکیده انگلیسی

We propose a new variant of the bit-parallel NFA of Baeza-Yates and Navarro (BPD) for approximate string matching [R. Baeza-Yates, G. Navarro, Faster approximate string matching, Algorithmica 23 (1999) 127–158]. BPD is one of the most practical approximate string matching algorithms under moderate pattern lengths and error levels [G. Myers, A fast bit-vector algorithm for approximate string matching based on dynamic programming, J. ACM 46 (3) 1989 395–415; G. Navarro, M. Raffinot, Flexible Pattern Matching in Strings—Practical On-line Search Algorithms for Texts and Biological Sequences, Cambridge University Press, Cambridge, UK, 2002]. Given a length-m pattern and an error threshold k, the original BPD requires (m−k)(k+2) bits of space to represent an NFA with (m−k)(k+1) states. In this paper we remove redundancy from the original NFA representation. Our variant requires (m−k)(k+1) bits of space, which is optimal in the sense that exactly one bit per state is used. The space efficiency is achieved by using an alternative, but equally or even more efficient, simulation algorithm for the bit-parallel NFA. We also present experimental results to compare our modified NFA against the original BPD and its main competitors. Our new variant is more efficient than the original BPD, and it hence takes over/extends the role of the original BPD as one of the most practical approximate string matching algorithms under moderate values of k and m.

ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Information Processing Letters - Volume 108, Issue 5, 15 November 2008, Pages 313-319