Article code | Journal code | Publication year | English article | Full-text version
6874768 | 688480 | 2016 | 22 pages, PDF | Free download
English title of the ISI article
An improved algorithm for the all-pairs suffix-prefix problem
Related topics
Engineering and Basic Sciences, Computer Engineering, Computational Theory and Mathematics
English abstract
Finding all longest suffix-prefix matches for a collection of strings is known as the all-pairs suffix-prefix match problem; its main application is de novo genome assembly. This problem is well studied in stringology and was solved optimally in 1992 by Gusfield et al. [8] using suffix trees. In 2010, Ohlebusch and Gog [13] proposed an alternative solution based on enhanced suffix arrays that also has optimal time complexity but is faster in practice. In this article, we present another optimal algorithm based on enhanced suffix arrays that further improves the practical running time. Our new solution solves the problem locally for each string, scanning the enhanced suffix array backwards to avoid processing suffixes that are not candidates for a suffix-prefix match. In an empirical evaluation, we observed that the new algorithm is more than twice as fast as, and more space-efficient than, the method proposed by Ohlebusch and Gog. When compared to Readjoiner [5], a good practical solution, our algorithm is faster for small overlap lengths, while retaining theoretical optimality.
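To make the problem statement in the abstract concrete, the following is a minimal brute-force sketch of the all-pairs suffix-prefix problem. It is not the enhanced-suffix-array algorithm the article describes, nor the method of [8] or [13]; it only illustrates what output those algorithms compute. The function names, the `min_len` parameter (a hypothetical minimum overlap length), and the choice to report 0 on the diagonal are assumptions made for illustration.

```python
from typing import List


def longest_overlap(a: str, b: str, min_len: int = 1) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 when no overlap of at least `min_len` characters exists.
    """
    # Try the longest possible overlap first and shrink until one matches.
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def all_pairs_suffix_prefix(strings: List[str], min_len: int = 1) -> List[List[int]]:
    """Brute-force overlap matrix: entry [i][j] is the longest suffix of
    strings[i] that is a prefix of strings[j].

    Assumption: a string is not compared with itself, so the diagonal is 0.
    """
    n = len(strings)
    return [
        [0 if i == j else longest_overlap(strings[i], strings[j], min_len)
         for j in range(n)]
        for i in range(n)
    ]


if __name__ == "__main__":
    reads = ["ACGT", "GTAC", "TACG"]
    for row in all_pairs_suffix_prefix(reads):
        print(row)
```

This sketch compares every ordered pair of strings directly, so its cost grows with the number of pairs times the string length. The optimal algorithms mentioned in the abstract avoid this by indexing all strings together, which is why they matter for assembly-scale inputs.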
Publisher
Database: Elsevier - ScienceDirect
Journal: Journal of Discrete Algorithms - Volume 37, March 2016, Pages 34-43