Empirical distribution of k-word matches in biological sequences

Article ID	Journal ID	Publication year	English article	Full-text version
531420	869839	2009	10-page PDF	Free download

Keywords

Biological sequences; Genomic data

Related topics

Engineering and Basic Sciences > Computer Engineering > Computer Vision and Pattern Recognition

Abstract

This study focuses on an alignment-free sequence comparison method: the number of words of length k shared between two sequences, also known as the D2 statistic. The advantages of the use of this statistic over alignment-based methods are firstly that it does not assume that homologous segments are contiguous, and secondly that the algorithm is computationally extremely fast, the runtime being proportional to the size of the sequence under scrutiny. Existing applications of the D2 statistic include the clustering of related sequences in large EST databases such as the STACK database. Such applications have typically relied on heuristics without any statistical basis. Rigorous statistical characterisations of the distribution of D2 have subsequently been undertaken, but have focussed on the distribution's asymptotic behaviour, leaving the distribution of D2 uncharacterised for most practical cases. The work presented here bridges these two worlds to give usable approximations of the distribution of D2 for ranges of parameters most frequently encountered in the study of biological sequences.
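
As a rough illustration of the statistic described in the abstract, the sketch below counts k-word matches between two sequences. It assumes the common definition of D2 as the number of position pairs (i, j) at which the k-words of the two sequences are identical, so repeated words are counted with multiplicity. The function names `kmer_counts` and `d2` are illustrative and are not taken from the paper.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count every overlapping word of length k in seq."""
    # range() is empty when the sequence is shorter than k.
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def d2(seq_a, seq_b, k):
    """Number of k-word matches between seq_a and seq_b.

    Assumed definition: every pair of positions (i, j) whose k-words
    are identical contributes 1 to the total.
    """
    counts_a = kmer_counts(seq_a, k)
    counts_b = kmer_counts(seq_b, k)
    # Iterate over the smaller table; a Counter returns 0 for absent words.
    if len(counts_a) > len(counts_b):
        counts_a, counts_b = counts_b, counts_a
    return sum(n * counts_b[word] for word, n in counts_a.items())
```

For example, `d2("ACGTAC", "TACG", 2)` evaluates to 4: the word AC occurs twice in the first sequence and once in the second, while CG and TA each occur once in both. For fixed k, each sequence is scanned once, which is consistent with the runtime proportional to sequence size mentioned in the abstract.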

Publisher

Database: Elsevier - ScienceDirect
Journal: Pattern Recognition - Volume 42, Issue 4, April 2009, Pages 539–548

Authors

Sylvain Forêt, Susan R. Wilson, Conrad J. Burden
