A k-mer-based barcode DNA classification methodology based on spectral representation and a neural gas network

Article code	Journal code	Publication year	English article	Full-text version
377567	658795	2015	12-page PDF	Free download


Keywords

Neural gas

Related topics

Engineering and Basic Sciences, Computer Engineering, Artificial Intelligence


Abstract

• We propose a new alignment-free method for classifying DNA barcode sequences, based on both a spectral representation and prototype-based unsupervised clustering.
• We investigate how closely the characteristics of different species are related to the spectral distribution of their DNA barcodes.
• We compare the proposed method with six state-of-the-art machine learning classifiers; the results confirm that our method outperforms all of them when applied to short fragments.
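As a rough illustration of what a spectral (k-mer frequency) representation can look like, the Python sketch below counts overlapping k-mers in a sequence and normalizes the counts. It is not the authors' implementation: the function name `kmer_spectrum`, the default `k=3`, the normalization, and the choice to skip windows containing ambiguous bases are all assumptions made for this example.

```python
from collections import Counter
from itertools import product


def kmer_spectrum(sequence, k=3):
    """Return the normalized frequency of every possible k-mer over A, C, G, T.

    Hypothetical helper for illustration; not taken from the paper.
    """
    sequence = sequence.upper()
    # Count overlapping windows, skipping any window with a non-ACGT symbol.
    counts = Counter(
        sequence[i:i + k]
        for i in range(len(sequence) - k + 1)
        if set(sequence[i:i + k]) <= set("ACGT")
    )
    total = sum(counts.values())
    # Emit all 4**k k-mers so every sequence maps to a vector of equal length.
    all_kmers = ("".join(p) for p in product("ACGT", repeat=k))
    return {kmer: (counts[kmer] / total if total else 0.0) for kmer in all_kmers}


# Example: the 2-mer spectrum of a short toy sequence.
spectrum = kmer_spectrum("ACGTACGTAC", k=2)
```

Because every sequence maps to a vector of the same length (4^k entries), sequences of different lengths, including short fragments, can be compared without alignment.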

Objectives: In this paper, an alignment-free method for DNA barcode classification is proposed, based on both a spectral representation and a neural gas network for unsupervised clustering.

Methods: In the proposed methodology, distinctive words are identified from a spectral representation of DNA sequences. A taxonomic classification of the DNA sequence is then performed using the sequence signature, i.e., the smallest set of k-mers that can assign a DNA sequence to its proper taxonomic category. Experiments were then performed to compare our method with other supervised machine learning classification algorithms, namely support vector machine, random forest, RIPPER, naïve Bayes, Ridor, and classification tree, also considering short DNA sequence fragments of 200 and 300 base pairs (bp). The experimental tests were conducted over 10 real barcode datasets belonging to different animal species, which were provided by the online resource “Barcode of Life Database”.

Results: The experimental results showed that our k-mer-based approach is directly comparable, in terms of accuracy, recall and precision metrics, with the other classifiers when considering full-length sequences. In addition, we demonstrate the robustness of our method when a classification task is performed with a set of short DNA sequences that were randomly extracted from the original data. For example, the proposed method reaches an accuracy of 64.8% at the species level with 200-bp fragments. Under the same conditions, the best of the other classifiers (random forest) reaches an accuracy of 20.9%.

Conclusions: Our results indicate that we obtained a clear improvement over the other classifiers for the study of short DNA barcode sequence fragments.
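For readers unfamiliar with neural gas, the following Python sketch shows the basic rank-based prototype update on which such networks rely: for each sample, all prototypes are ranked by distance and moved toward the sample by a step that decays with rank. It is a simplified sketch, not the network used in the paper: the function name `train_neural_gas` and all parameter values are assumptions, and the learning rate `eps` and neighborhood range `lam` are held fixed here, whereas they are usually decayed during training.

```python
import math
import random


def train_neural_gas(data, n_units=2, epochs=30, eps=0.3, lam=1.0, seed=0):
    """Fit prototype vectors to `data` with a basic neural gas update rule.

    Hypothetical helper for illustration; eps and lam are kept constant.
    """
    rng = random.Random(seed)
    # Initialize prototypes as copies of randomly chosen data points.
    units = [list(x) for x in rng.sample(data, n_units)]
    for _ in range(epochs):
        for x in rng.sample(data, len(data)):  # visit samples in random order
            # Rank units by squared Euclidean distance to the sample.
            order = sorted(
                range(n_units),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(units[j], x)),
            )
            for rank, j in enumerate(order):
                # The closest unit (rank 0) moves most; steps shrink with rank.
                step = eps * math.exp(-rank / lam)
                units[j] = [w + step * (a - w) for w, a in zip(units[j], x)]
    return units


# Toy data: two small groups of 2-D points standing in for feature vectors.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
prototypes = train_neural_gas(points, n_units=2)
```

In a k-mer setting, the input vectors would be spectral representations of the sequences rather than 2-D points; how the trained prototypes are then mapped to taxonomic labels is not specified in the abstract and is left out here.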

Publisher

Database: Elsevier - ScienceDirect
Journal: Artificial Intelligence in Medicine - Volume 64, Issue 3, July 2015, Pages 173–184

Authors

Antonino Fiannaca, Massimo La Rosa, Riccardo Rizzo, Alfonso Urso
