n-Gram-based classification and unsupervised hierarchical clustering of genome sequences

Article code	Journal code	Publication year	English article	Full-text version
469969	698375	2006	17 pages (PDF)	Free download


Keywords

n-Gram; Genome sequence; Hierarchical clustering; Classification

Related topics

Engineering and Basic Sciences; Computer Engineering; Computer Science (General)


Abstract

In this paper we address the problem of automated classification of isolates, i.e., the problem of determining the family of genomes to which a given genome belongs. Additionally, we address the problem of automated unsupervised hierarchical clustering of isolates according only to their statistical substring properties. For both of these problems we present novel algorithms based on nucleotide n-grams, with no required preprocessing steps such as sequence alignment. Results obtained experimentally are very positive and suggest that the proposed techniques can be successfully used in a variety of related problems. The reported experiments demonstrate better performance than some of the state-of-the-art methods. We report on a new distance measure between n-gram profiles, which shows superior performance compared to many other measures, including commonly used Euclidean distance.
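
The abstract does not spell out the algorithms or the new distance measure, so the following is only a minimal sketch of the general idea it describes: build an n-gram profile for each sequence without any alignment, then assign a genome to the family whose profile is closest. It uses the Euclidean distance that the abstract names as a commonly used baseline, not the authors' proposed measure. The function names, the choice of n = 3, and the toy sequences are all assumptions made for illustration.

```python
from collections import Counter
from math import sqrt


def ngram_profile(sequence, n=3):
    """Relative frequency of every length-n substring (n-gram) in the sequence."""
    seq = sequence.upper()
    counts = Counter(seq[i:i + n] for i in range(len(seq) - n + 1))
    total = sum(counts.values())
    if total == 0:
        return {}  # sequence shorter than n: empty profile
    return {gram: count / total for gram, count in counts.items()}


def euclidean_distance(p, q):
    """Euclidean distance between two profiles; missing n-grams count as 0."""
    grams = set(p) | set(q)
    return sqrt(sum((p.get(g, 0.0) - q.get(g, 0.0)) ** 2 for g in grams))


def classify(sequence, family_profiles, n=3):
    """Return the name of the family whose profile is nearest to the sequence's."""
    profile = ngram_profile(sequence, n)
    return min(
        family_profiles,
        key=lambda family: euclidean_distance(profile, family_profiles[family]),
    )


# Toy, made-up sequences standing in for per-family reference genomes.
families = {
    "family_a": ngram_profile("ACGTACGTACGTACGT"),
    "family_b": ngram_profile("TTTTGGGGTTTTGGGG"),
}
print(classify("ACGTACGAACGT", families))
```

The same pairwise distances could also feed a standard agglomerative clustering routine to obtain an unsupervised hierarchy, although the abstract does not say which clustering procedure the authors use.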

Publisher

Database: Elsevier - ScienceDirect
Journal: Computer Methods and Programs in Biomedicine - Volume 81, Issue 2, February 2006, Pages 137–153

Authors

Andrija Tomović, Predrag Janičić, Vlado Kešelj
