A new distribution vector and its application in genome clustering

Article ID	Journal ID	Publication year	English article	Full-text version
2834219	1164299	2011	6-page PDF	Free download

Keywords

Clustering; Phylogenetic tree

Related topics

Life Sciences and Biotechnology; Agricultural and Biological Sciences; Ecology, Evolution, Behavior and Systematics

Abstract

In this paper we report a novel mathematical method that transforms DNA sequences into distribution vectors, which correspond to points in sixty-dimensional space. Each component of the distribution vector represents the distribution of one kind of nucleotide in k segments of the DNA sequence. The mathematical and statistical properties of the distribution vectors are demonstrated and examined with large datasets of human DNA sequences and random sequences. The determined expectation and standard deviation make the mapping stable and practicable. Moreover, we apply the distribution vectors to the clustering of the haemagglutinin (HA) gene of 60 H1N1 viruses from human, swine and avian hosts, the complete mitochondrial genomes of 80 placental mammals, and the complete genomes of 50 bacteria. The 60 H1N1 viruses, 80 placental mammals and 50 bacteria are classified accurately and rapidly compared with multiple sequence alignment methods. The results indicate that the distribution vectors can reveal the similarity and evolutionary relationships among homologous DNA sequences, based on the distance between any two of these vectors. Because they are fast to compute, the distribution vectors can handle very large numbers of DNA sequences efficiently.
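
The abstract does not give the exact definition of the distribution vector, so the following is only a simplified sketch of the general idea, not the authors' construction. It assumes that a sequence is cut into k equal-length segments and that each component is the frequency of one nucleotide in one segment. The value k = 15 is a hypothetical choice made only so that 4 nucleotides × 15 segments gives 60 components, and the names `distribution_vector` and `euclidean_distance` are illustrative.

```python
from math import sqrt

NUCLEOTIDES = "ACGT"


def distribution_vector(seq, k=15):
    """Simplified stand-in for a distribution vector (an assumption, not the
    paper's definition): frequency of each nucleotide in each of k segments."""
    seq = seq.upper()
    n = len(seq)
    if n < k:
        raise ValueError("sequence must contain at least k nucleotides")
    vector = []
    for base in NUCLEOTIDES:
        for i in range(k):
            # Integer arithmetic keeps segment boundaries contiguous and non-empty.
            start = i * n // k
            end = (i + 1) * n // k
            segment = seq[start:end]
            # Dividing by the segment length makes components comparable
            # across sequences of different lengths.
            vector.append(segment.count(base) / len(segment))
    return vector


def euclidean_distance(u, v):
    """Distance between two vectors of equal length."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
```

Under these assumptions, pairwise distances between such vectors could be passed to any standard clustering or tree-building routine. No alignment step is involved, which is where the speed advantage claimed in the abstract would come from.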

Highlights
► The expectation and standard deviation of our vectors do not depend on the length of the sequence.
► Our method can cluster homologous DNA sequences correctly.
► The distribution vector method is much faster than multiple sequence alignment (MSA) methods.
► Our method can discover the functionality or the evolution of a new sequence.

Publisher

Database: Elsevier - ScienceDirect
Journal: Molecular Phylogenetics and Evolution - Volume 59, Issue 2, May 2011, Pages 438–443

Authors

Bo Zhao, Rong L. He, Stephen S.-T. Yau
