Article code: 494781
Journal code: 862807
Publication year: 2016
English article: 8 pages, PDF
Full-text version: free download
English title of the ISI article
Feature based quality assessment of DNA sequencing chromatograms
Related topics
Engineering and Basic Sciences, Computer Engineering, Computer Science Applications
English abstract

Although next-generation sequencing applications are becoming dominant in molecular genetics, many institutions still want to use their legacy sequencers as much as possible. An important concern in sequencing services is the quality of the trace files presented to customers. In this respect, trace files should be screened for quality, and low-quality files should be handled differently before they reach customers. The quality scores already present in the trace files provide some useful information; however, incorporating auxiliary information can improve the reliability of these scores. To this end, we used a feature-based supervised classification strategy, which requires a set of training and testing trace files whose quality has been determined manually. We tested several machine learning algorithms, namely k-nearest neighbors, Naive Bayes, Support Vector Machines, and Random Forest (RF), on a public DNA trace repository. Our results indicate that the RF method with only 4 simple features provides a classification accuracy of 94.68% with a high level of agreement (Kappa = 0.8679).
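
The abstract reports two evaluation figures, classification accuracy and Cohen's kappa. As a rough illustration of how these two numbers relate, the sketch below computes both from a pair of label lists using only the Python standard library. The function names and the ten example labels are hypothetical and chosen for illustration; they are not the paper's data, features, or code, and they do not reproduce the reported values.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of items on which the two label lists agree."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e).

    p_o is the observed agreement (the accuracy) and p_e is the agreement
    expected by chance given each rater's label frequencies.
    """
    n = len(y_true)
    p_o = accuracy(y_true, y_pred)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement: sum over classes of the product of marginal rates.
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when both lists use a single label")
    return (p_o - p_e) / (1 - p_e)


# Hypothetical labels: manual quality calls vs. a classifier's predictions.
manual = ["good"] * 6 + ["bad"] * 4
predicted = ["good"] * 5 + ["bad"] + ["bad"] * 3 + ["good"]

print("accuracy:", accuracy(manual, predicted))
print("kappa:", round(cohen_kappa(manual, predicted), 4))
```

With these invented labels the accuracy is 0.8 while kappa is about 0.58, which shows why kappa is reported alongside accuracy: it discounts the agreement that would occur by chance given the class proportions.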

Publisher
Database: Elsevier - ScienceDirect
Journal: Applied Soft Computing - Volume 41, April 2016, Pages 420–427