Robust two-gene classifiers for cancer prediction

Article ID	Journal	Published Year	Pages	File Type
2821054	Genomics	2012	6 Pages	PDF

Abstract

Two-gene classifiers have attracted broad interest for their simplicity and practicality. Most existing two-gene classification algorithms rely on exhaustive search, which makes them slow. In this study, we proposed two new two-gene classification algorithms that use a simple univariate gene selection strategy and construct simple classification rules based on optimal cut-points for the two selected genes. We detected the optimal cut-point using the information entropy principle. We applied the two-gene classification models to eleven cancer gene expression datasets and compared their classification performance with that of established two-gene classification models, such as the top-scoring pairs model and the greedy pairs model, as well as standard methods including Diagonal Linear Discriminant Analysis, k-Nearest Neighbor, Support Vector Machine and Random Forest. These comparisons indicated that the performance of our two-gene classifiers was comparable to or better than that of the compared models.
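The entropy-based cut-point detection mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `entropy` and `best_cut_point`, the use of midpoints between adjacent sorted expression values as candidate cuts, and the size-weighted class entropy as the criterion are all assumptions made for illustration. The abstract does not specify how the two per-gene cut-points are combined into a classification rule, so that step is not shown.

```python
import math
from typing import Sequence, Tuple


def entropy(labels: Sequence[int]) -> float:
    """Shannon entropy (in bits) of a sequence of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def best_cut_point(values: Sequence[float],
                   labels: Sequence[int]) -> Tuple[float, float]:
    """Find the cut on one gene that minimizes the weighted class entropy.

    Assumption: candidate cuts are midpoints between adjacent distinct
    sorted expression values. Returns (cut, weighted_entropy); the cut is
    NaN if all values are identical.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best_cut, best_h = float("nan"), float("inf")
    for i in range(1, n):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # no cut can separate tied expression values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        # Entropy of each side, weighted by the fraction of samples on it.
        h = (len(left) * entropy(left) + len(right) * entropy(right)) / n
        if h < best_h:
            best_cut, best_h = (lo + hi) / 2, h
    return best_cut, best_h


# Toy expression values for a single hypothetical gene (0 = normal, 1 = tumor).
expression = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
classes = [0, 0, 0, 1, 1, 1]
cut, h = best_cut_point(expression, classes)
print(cut, h)
```

Under this sketch, a gene whose best cut yields a low weighted entropy separates the classes well, so the same score could plausibly serve as a univariate ranking criterion; whether the paper ranks genes this way is not stated in the abstract.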

► We proposed two genuine two-gene classifiers for cancer prediction.
► Our models used a simple univariate gene selection strategy.
► Our models used simple classification rules built with the information entropy principle.
► Our models had performance comparable to that of existing methods.
► Simple models have substantial advantages over complicated ones.

Keywords

Information entropy; Computational biology; Cancer; Classification; Gene expression profiling