Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
402622 | 676968 | 2015 | 19-page PDF | Free download |
![First page of the article: Classification of microarray using MapReduce based proximal support vector machine classifier](/preview/png/402622.png)
• Feature selection and classification of microarray data using the Hadoop framework.
• MapReduce-based statistical tests are proposed for feature selection (FS).
• A MapReduce-based PSVM (mrPSVM) classifier is proposed for classifying microarray data.
• A comparative analysis of the results is performed for the FS methods in combination with mrPSVM.
Microarray-based gene expression profiling has emerged as an efficient technique for the classification, diagnosis, prognosis, and treatment of cancer. Because the nature of this disease changes frequently, it generates a huge volume of data. The data retrieved from microarrays reflect both the varied nature of the disease (veracity) and the changes observed over time (velocity), so microarray datasets must be analyzed within a very short period. The major drawback of microarray data is the 'curse of dimensionality', which obscures the useful information in the dataset and leads to computational instability. Selecting relevant genes is therefore imperative in microarray data analysis. Most existing schemes employ a two-phase process: feature selection or extraction followed by classification. In this paper, various statistical methods (tests) based on MapReduce are proposed to select the relevant features. After feature selection, a MapReduce-based proximal support vector machine (mrPSVM) classifier is proposed to classify the microarray data. These algorithms are implemented on the Hadoop framework. A comparative analysis of the feature selection methods is carried out using microarray datasets of various dimensions. Experimental results show that the combination of the mrPSVM classifier with the various feature selection methods produces a better accuracy rate on the benchmark dataset.
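To make the first phase of the two-phase process concrete, the sketch below shows one way a statistical test can be phrased as map and reduce steps. It is not the paper's implementation: the choice of Welch's t-test, the in-memory `shuffle` helper standing in for Hadoop's shuffle phase, and all function and variable names (`mapper`, `reducer`, `select_features`, `toy_samples`) are assumptions made for illustration. It also assumes two classes labeled 0 and 1, with at least two samples in each.

```python
from collections import defaultdict
from math import sqrt


def mapper(sample):
    """Map: for one sample, emit (gene, (label, count, sum, sum_of_squares))."""
    label, expression = sample
    for gene, x in expression.items():
        yield gene, (label, 1, x, x * x)


def shuffle(pairs):
    """Group mapper output by gene key (stand-in for the framework's shuffle)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reducer(gene, values):
    """Reduce: merge per-class statistics for one gene and score it with Welch's t."""
    acc = {0: [0, 0.0, 0.0], 1: [0, 0.0, 0.0]}  # label -> [n, sum, sum of squares]
    for label, n, s, ss in values:
        acc[label][0] += n
        acc[label][1] += s
        acc[label][2] += ss

    def mean_and_var(n, s, ss):
        mean = s / n
        # Unbiased sample variance; clamp tiny negative values caused by rounding.
        return mean, max((ss - n * mean * mean) / (n - 1), 0.0)

    (n0, s0, ss0), (n1, s1, ss1) = acc[0], acc[1]
    mean0, var0 = mean_and_var(n0, s0, ss0)
    mean1, var1 = mean_and_var(n1, s1, ss1)
    denom = sqrt(var0 / n0 + var1 / n1)
    t = (mean0 - mean1) / denom if denom > 0 else 0.0
    return gene, t


def select_features(samples, k):
    """Run map, shuffle, reduce, then keep the k genes with the largest |t|."""
    pairs = (kv for sample in samples for kv in mapper(sample))
    scores = [reducer(gene, vals) for gene, vals in shuffle(pairs).items()]
    scores.sort(key=lambda item: abs(item[1]), reverse=True)
    return [gene for gene, _ in scores[:k]]


# Hypothetical toy data: (class label, {gene: expression value}).
toy_samples = [
    (0, {"g1": 1.0, "g2": 5.0, "g3": 2.0}),
    (0, {"g1": 1.2, "g2": 4.0, "g3": 2.1}),
    (0, {"g1": 0.9, "g2": 6.0, "g3": 1.9}),
    (1, {"g1": 3.0, "g2": 5.5, "g3": 2.0}),
    (1, {"g1": 3.1, "g2": 4.5, "g3": 2.2}),
    (1, {"g1": 2.9, "g2": 5.0, "g3": 1.8}),
]
top_genes = select_features(toy_samples, 1)
```

Because each mapper emits only counts, sums, and sums of squares, partial results from different data splits can be added together in any order, which is what makes this kind of test a natural fit for MapReduce. The selected genes would then be passed to the classifier in the second phase.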
Journal: Knowledge-Based Systems - Volume 89, November 2015, Pages 584–602