Predicting incomplete gene microarray data with the use of supervised learning algorithms

Article code	Journal code	Publication year	English article	Full-text version
534410	870249	2010	9-page PDF	Free download

Keywords

Microarray data; Incomplete data; Supervised learning; Prediction

Related topics

Engineering and basic sciences, Computer engineering, Computer vision and pattern recognition

Abstract

Motivation: With the wealth of sequence data and the huge volume of data generated by molecular technologies, gene classification and prediction has become a central challenge in microarray data analysis. This has led to the application of many well-established supervised learning (SL) algorithms in an attempt to provide more accurate and automatic prediction of diagnostic class (cancer/non-cancer). Virtually all research on SL addresses the task of learning to classify complete domain instances. In practice, however, instances often have to be classified from incomplete vectors, which can affect the predictive accuracy of learned classifiers. Learning an accurate classifier for incomplete data raises a number of new issues, some of which have not been properly addressed by bioinformatics research. An effective missing value estimation method is therefore required to improve predictive accuracy.

Results: The essence of the approach is the proposal that prediction using supervised learning can be improved in probabilistic terms given incomplete microarray data. The imputation approach is based on the a priori probability of each value, determined from the instances at that node of a decision tree (PDT) that have specified values. The proposed approach exploits the law of total probability and Bayes' theorem, and it has three versions. We evaluate our approach against other supervised learning techniques, including C5.0, classification and regression trees (CART), k-nearest neighbour (k-NN), linear discrimination (LD), naïve Bayes classifier (NBC), Repeated Incremental Pruning to Produce Error Reduction (RIPPER) and support vector machines (SVMs), in terms of their tolerance of incomplete test data. Eight cancer-related gene expression datasets are used for this task. Experimental results are provided to illustrate the efficiency and robustness of the proposed algorithm.
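The abstract does not spell out the three versions of the approach, so the following is only a minimal sketch of the general idea it names, not the authors' algorithm. It assumes a single decision-tree node that tests one discretized expression attribute; the function names, the attribute name `g1`, and the toy data are hypothetical. When the test instance's value is missing, the class distribution is obtained by weighting each possible value by its a priori probability among the instances at the node that have specified values (law of total probability). Bayes' theorem then gives the probability of each candidate value for the missing entry once a class is assumed.

```python
from collections import Counter


def node_counts(rows, labels, attr):
    """Count values and (value, class) pairs over instances whose attr is specified."""
    known = [(r[attr], y) for r, y in zip(rows, labels) if r[attr] is not None]
    if not known:
        raise ValueError("no instance at this node has a specified value")
    value_counts = Counter(v for v, _ in known)
    joint_counts = Counter(known)
    classes = sorted({y for _, y in known})
    return value_counts, joint_counts, classes, len(known)


def class_distribution(rows, labels, attr, observed=None):
    """P(class) at the node; marginalizes over attr when observed is missing."""
    value_counts, joint_counts, classes, n = node_counts(rows, labels, attr)
    if observed is not None and observed in value_counts:
        # Value is known: follow the single matching branch.
        return {c: joint_counts[(observed, c)] / value_counts[observed]
                for c in classes}
    # Value is missing: P(c) = sum over v of P(v) * P(c | v).
    return {
        c: sum((value_counts[v] / n) * (joint_counts[(v, c)] / value_counts[v])
               for v in value_counts)
        for c in classes
    }


def value_posterior(rows, labels, attr, cls):
    """Bayes' theorem: P(v | cls) = P(cls | v) * P(v) / P(cls)."""
    value_counts, joint_counts, _, n = node_counts(rows, labels, attr)
    p_cls = class_distribution(rows, labels, attr)[cls]
    return {
        v: (joint_counts[(v, cls)] / value_counts[v]) * (value_counts[v] / n) / p_cls
        for v in value_counts
    }


# Hypothetical toy data: one discretized gene, five instances, one value missing.
rows = [{"g1": "high"}, {"g1": "high"}, {"g1": "high"}, {"g1": "low"}, {"g1": None}]
labels = ["cancer", "cancer", "normal", "normal", "cancer"]

print(class_distribution(rows, labels, "g1"))          # test value missing
print(class_distribution(rows, labels, "g1", "high"))  # test value observed
print(value_posterior(rows, labels, "g1", "cancer"))   # candidate imputed values
```

Real microarray values are continuous, so a full implementation would need a discretization or a density estimate at each node; that step is left out here.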

Publisher

Database: Elsevier - ScienceDirect
Journal: Pattern Recognition Letters - Volume 31, Issue 13, 1 October 2010, Pages 2061–2069

Authors

Bhekisipho Twala, Motee Phorah