کد مقاله کد نشریه سال انتشار مقاله انگلیسی نسخه تمام متن
429604 687607 2012 15 صفحه PDF دانلود رایگان
عنوان انگلیسی مقاله ISI
A randomized Numerical Aligner (rNA)
موضوعات مرتبط
مهندسی و علوم پایه مهندسی کامپیوتر نظریه محاسباتی و ریاضیات
پیش نمایش صفحه اول مقاله
A randomized Numerical Aligner (rNA)
چکیده انگلیسی

With the advent of new sequencing technologies able to produce an enormous quantity of short genomic sequences, new tools able to search for them inside a genomic reference sequence have emerged. Because of chemical reading errors or of the variability between organisms, one is interested in finding not only exact occurrences, but also occurrences with up to k mismatches. The contribution of this paper is twofold. On the one hand, we present a generalization of the classical Rabin–Karp string matching algorithm to solve the k  -mismatch problem, with average complexity O(n+m)O(n+m) (n text and m pattern lengths, respectively). On the other hand, we show how to employ this idea in conjunction with an index over the text, allowing to search a pattern, with up to k mismatches, in time proportional to its length. This novel tool—rNA (randomized Numerical Aligner)—is in general faster and more accurate than other available tools like SOAP2, BWA, and BOWTIE. rNA executables and source code are freely available at http://iga-rna.sourceforge.net/.


► Generalization of the Rabin–Karp string matching algorithm for the k-mismatch problem.
► Definition and study of Hamming-aware hash functions.
► Implementation and design of a short string aligner: rNA.
► Comparison and evaluation of rNA in the Next Generation Sequencing context.

ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Journal of Computer and System Sciences - Volume 78, Issue 6, November 2012, Pages 1868–1882
نویسندگان
, , ,