Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
4500005 | 1624014 | 2015 | 8-page PDF | Free download |
• We integrate GO semantic similarity into affinity propagation clustering.
• A neighborhood rough set is applied to the biologically relevant clusters for gene selection.
• Quantitative analysis is used to assess the impact of biological similarity on the results.
• An ensemble classifier is built to improve robustness and generalization.
Classifying microarray data is challenging because of the enormous number of genes involved. This study presents a clustering method that integrates plant stress response gene expression data with biological knowledge. Clustering is a promising tool for attribute reduction, but gene clusters derived from expression data alone are often biologically uninformative. We therefore integrated biological knowledge into the genomic analysis to improve the interpretability of the results. Biological similarity, measured as gene ontology (GO) semantic similarity, was combined with gene expression data to find biologically meaningful clusters. The affinity propagation clustering algorithm was chosen to analyze the impact of biological similarity on the results. Based on the clustering result, a neighborhood rough set was used to select representative genes for each cluster. The prediction accuracy of classifiers built on the reduced gene subsets indicated that our approach outperformed other classical methods. Quantitative analysis showed the information fusion to be effective: it selected significant genes and gene subsets with high biological significance.
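The fusion step described above can be illustrated with a minimal sketch. The abstract does not state how the two similarities are combined, so everything specific here is an assumption: the linear blend with a mixing weight `alpha`, Pearson correlation as the expression similarity, the median-based preference, the damping value, and the toy data are all hypothetical choices, and the function names are invented for this example. The clustering routine is a plain textbook affinity propagation, not the authors' implementation.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def fused_similarity(expr, go_sim, alpha=0.5):
    """Blend expression and GO semantic similarity (assumed linear blend)."""
    n = len(expr)
    S = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for k in range(n):
            e = (pearson(expr[i], expr[k]) + 1.0) / 2.0  # rescale to [0, 1]
            S[i][k] = alpha * e + (1.0 - alpha) * go_sim[i][k]
    return S

def affinity_propagation(S, damping=0.5, iters=200):
    """Textbook affinity propagation; returns an exemplar index per point."""
    n = len(S)
    S = [row[:] for row in S]
    # Assumed preference: median of the off-diagonal similarities.
    off = sorted(S[i][k] for i in range(n) for k in range(n) if i != k)
    for i in range(n):
        S[i][i] = off[len(off) // 2]
    R = [[0.0] * n for _ in range(n)]  # responsibilities
    A = [[0.0] * n for _ in range(n)]  # availabilities
    for _ in range(iters):
        for i in range(n):
            vals = [A[i][k] + S[i][k] for k in range(n)]
            best = max(range(n), key=vals.__getitem__)
            first = vals[best]
            second = max(v for j, v in enumerate(vals) if j != best)
            for k in range(n):
                new = S[i][k] - (second if k == best else first)
                R[i][k] = damping * R[i][k] + (1.0 - damping) * new
        for k in range(n):
            pos = [max(0.0, R[i][k]) for i in range(n)]
            total = sum(pos) - pos[k]  # positive support from points other than k
            for i in range(n):
                new = total if i == k else min(0.0, R[k][k] + total - pos[i])
                A[i][k] = damping * A[i][k] + (1.0 - damping) * new
    exemplars = [k for k in range(n) if R[k][k] + A[k][k] > 0]
    if not exemplars:
        return [-1] * n  # no exemplar emerged within the iteration budget
    return [i if i in exemplars else max(exemplars, key=lambda k: S[i][k])
            for i in range(n)]

# Toy data invented for illustration: two groups of three genes each.
expr = [
    [1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0], [1.5, 2.5, 3.5, 4.5],
    [4.0, 3.0, 2.0, 1.0], [8.0, 6.0, 4.0, 2.0], [4.5, 3.5, 2.5, 1.5],
]
go_sim = [
    [1.0, 0.9, 0.9, 0.1, 0.1, 0.1],
    [0.9, 1.0, 0.5, 0.1, 0.1, 0.1],
    [0.9, 0.5, 1.0, 0.1, 0.1, 0.1],
    [0.1, 0.1, 0.1, 1.0, 0.9, 0.9],
    [0.1, 0.1, 0.1, 0.9, 1.0, 0.5],
    [0.1, 0.1, 0.1, 0.9, 0.5, 1.0],
]
S = fused_similarity(expr, go_sim, alpha=0.5)
labels = affinity_propagation(S)
print(labels)
```

The toy input is built so that genes 0 to 2 and genes 3 to 5 agree in both expression and GO similarity, which should separate them into two clusters. Varying `alpha` between 0 and 1 is one simple way to probe how much the biological similarity influences the clustering, in the spirit of the quantitative analysis the abstract mentions.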
Journal: Mathematical Biosciences - Volume 266, August 2015, Pages 65–72