کد مقاله کد نشریه سال انتشار مقاله انگلیسی نسخه تمام متن
434357 689720 2014 8 صفحه PDF دانلود رایگان
عنوان انگلیسی مقاله ISI
Speeding up the detection of tandem repeats over the edit distance
ترجمه فارسی عنوان
تسریع تشخیص تکرار تاندوم در فاصله ویرایش
موضوعات مرتبط
مهندسی و علوم پایه مهندسی کامپیوتر نظریه محاسباتی و ریاضیات
چکیده انگلیسی

A tandem repeat in a string is a sequence of two or more contiguous, approximate copies of a pattern. Finding an efficient, deterministic algorithm to perform an exhaustive search for tandem repeats in a string is an important and practical problem, due to the relevance of tandem repeats in several areas, including human identity testing, disease diagnosis, sequence homology, and population studies.In this paper, we present an O(nklog2k+Occ) algorithm to find all approximate tandem repeats within a sequence of length n, allowing at most k insertions, deletions and mismatches in each repeat. Our algorithm utilizes the Lempel–Ziv factorization which was previously used in algorithms that locate exact tandem repeats, and algorithms that locate tandem repeats with only mismatches. The LZ framework is combined with speedups for calculating the edit distance, achieving a new and efficient exhaustive search for finding tandem repeats in a string.

ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Theoretical Computer Science - Volume 525, 13 March 2014, Pages 103-110