Article code: 533619
Journal code: 870138
Publication year: 2010
English article: 11 pages, PDF
Full-text version: free download
English title of the ISI article
A linguistic approach to classification of bacterial genomes
Related topics
Engineering and Basic Sciences; Computer Engineering; Computer Vision and Pattern Recognition
English abstract

In this paper, 188 prokaryote genomes are classified by separately calculating compositional spectra for the coding and the non-coding parts of each genome. For each subsequence, the compositional spectrum is mapped to a point in a vector space, which allows the genomes to be grouped into meaningful categories by a formal method. Repeating the clustering for the coding and the non-coding genome parts makes it possible to estimate the true number of genome clusters. The proposed method is based on a new application of external cluster validation indexes and on the misclassified quantities obtained in the process of repeated clustering. In addition, we construct a further embedding of the data into an appropriate Euclidean space based only on the distances between compositional spectra. Biological evaluation of the results obtained for the 4-letter and the 2-letter alphabets supports the appropriateness of the resulting cluster-based classification.
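
To make the pipeline in the abstract more concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that a compositional spectrum can be approximated by normalized exact k-mer frequencies, that the 2-letter alphabet is the purine/pyrimidine reduction, that spectra are compared with Euclidean distance, and that the Rand index stands in for an external cluster validation index. None of these choices is stated in the abstract, and all function names are hypothetical.

```python
from collections import Counter
from itertools import combinations, product
import math

# Assumed 2-letter reduction: purines (A, G) -> R, pyrimidines (C, T) -> Y.
PURINE_PYRIMIDINE = {"A": "R", "G": "R", "C": "Y", "T": "Y"}


def reduce_alphabet(seq, mapping=PURINE_PYRIMIDINE):
    """Rewrite a DNA string in a reduced alphabet, skipping unknown symbols."""
    return "".join(mapping[ch] for ch in seq.upper() if ch in mapping)


def kmer_spectrum(seq, k, alphabet="ACGT"):
    """Return normalized frequencies of all k-letter words, in lexicographic
    order of the alphabet. This is a simplified stand-in for a compositional
    spectrum: it counts exact matches only."""
    seq = seq.upper()
    words = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[w] for w in words)
    if total == 0:
        return [0.0] * len(words)
    return [counts[w] / total for w in words]


def euclidean(u, v):
    """Euclidean distance between two spectra viewed as points in a vector space."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def rand_index(labels_a, labels_b):
    """Fraction of item pairs on which two clusterings agree
    (both place the pair together, or both place it apart)."""
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        return 1.0
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)
```

In this sketch, the coding and the non-coding parts of a genome would each be passed to `kmer_spectrum`, the resulting vectors clustered separately, and the two label lists compared with `rand_index`; a high agreement across repeated runs for a given number of clusters would be one possible signal that the number is plausible.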

Publisher
Database: Elsevier - ScienceDirect
Journal: Pattern Recognition - Volume 43, Issue 3, March 2010, Pages 1083–1093