Article code: 2814770
Journal code: 1159828
Year of publication: 2016
English article: 9 pages, PDF
Full-text version: free download
English title of the ISI article
Identification of species based on DNA barcode using k-mer feature vector and Random forest classifier
Related topics
Life Sciences and Biotechnology; Biochemistry, Genetics and Molecular Biology; Genetics
English abstract


• A computational approach was developed for species identification by analyzing the DNA barcode sequence.
• The developed approach achieved higher species identification success rates than ad-hoc approaches.
• Based on the developed approach, a web interface, SPIDBAR, was developed for identification of species by taxonomists.
• The proposed approach is believed to supplement the existing machine-learning based approaches.

DNA barcoding is a molecular diagnostic method that allows automated and accurate identification of species based on a short, standardized fragment of DNA. To this end, this study develops a computational approach that identifies a species by comparing its barcode with the barcode sequences of known species in a reference library. Each barcode sequence was first mapped onto a numeric feature vector based on k-mer frequencies, and the Random forest methodology was then applied to the transformed dataset for species identification. The proposed approach outperformed similarity-based, tree-based, and diagnostic-based approaches and was found comparable to existing supervised-learning-based approaches in terms of species identification success rate when compared using real and simulated datasets. Based on the proposed approach, an online web interface, SPIDBAR, has also been developed and made freely available at http://cabgrid.res.in:8080/spidbar/ for species identification by taxonomists.
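
The first step described in the abstract, mapping a barcode sequence onto a numeric vector of k-mer frequencies, might look roughly like the Python sketch below. It is an illustration only, not the authors' implementation: the function names `kmer_vocabulary` and `kmer_feature_vector`, the normalization to relative frequencies, and the skipping of windows that contain characters other than A, C, G, and T are all assumptions that the abstract does not specify.

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGT"


def kmer_vocabulary(k):
    """Return all 4**k possible k-mers over ACGT in lexicographic order."""
    return ["".join(p) for p in product(ALPHABET, repeat=k)]


def kmer_feature_vector(sequence, k):
    """Map a DNA sequence to a vector of relative k-mer frequencies.

    Assumption (not from the paper): windows containing any character
    outside ACGT, such as N or a gap, are ignored, and counts are
    divided by the number of valid windows.
    """
    seq = sequence.upper()
    vocab = kmer_vocabulary(k)
    # Count every overlapping window of length k.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Keep only counts for valid k-mers, in a fixed vocabulary order.
    valid = [counts.get(kmer, 0) for kmer in vocab]
    total = sum(valid)
    if total == 0:
        return [0.0] * len(vocab)
    return [c / total for c in valid]
```

Each barcode then becomes a fixed-length vector of 4^k values, so sequences of different lengths can share one feature table. Those vectors, paired with species labels from a reference library, could be passed to any Random forest implementation for the classification step, which is not shown here.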

Publisher
Database: Elsevier - ScienceDirect
Journal: Gene - Volume 592, Issue 2, 5 November 2016, Pages 316–324