Optimal classifiers with minimum expected error within a Bayesian framework

Article ID	Journal	Published Year	Pages	File Type
532222	Pattern Recognition	2013	14 Pages	PDF

Abstract

In recent years, biomedicine has faced a flood of difficult small-sample phenotype discrimination problems. A host of classification rules have been proposed to discriminate types of pathology, stages of disease and other diagnoses. Typically, these classification rules are heuristic algorithms, with very little understood about their performance. To give a concrete mathematical structure to the problem, recent work has utilized a Bayesian modeling framework based on an uncertainty class of feature-label distributions to both optimize and analyze error estimator performance. The current study uses the same Bayesian framework to also optimize classifier design. This completes a Bayesian theory of classification, where both the classifier error and the estimate of the error may be optimized and studied probabilistically within the model framework. This paper, the first of a two-part study, derives optimal classifiers in discrete and Gaussian models, demonstrates their superior performance over popular classifiers within the assumed model, and applies the method to real genomic data. The second part of the study discusses properties of these optimal Bayesian classifiers.

► Recent work uses a Bayesian modeling framework to optimize and analyze classifier error estimates.
► Here we use the same Bayesian framework to also optimize classifier design.
► This work thus completes a Bayesian theory of classification based on optimizing performance.
► Here, in Part I, we derive optimal Bayesian classifiers in both the discrete and Gaussian models.
► Superior performance is shown in synthetic data studies under the assumed model, and in real data.
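
To make the idea of a classifier with minimum expected error in the discrete model concrete, the following is a minimal sketch and not the paper's derivation. It assumes independent Dirichlet priors on the bin probabilities of each class and a beta prior on the class-0 probability; these prior choices, and every function and parameter name below, are illustrative assumptions that the abstract does not state. Under those assumptions the expected error decomposes bin by bin, so each bin is assigned to the class with the larger posterior expected mass.

```python
from typing import List, Sequence


def posterior_mean_bins(counts: Sequence[int], alpha: Sequence[float]) -> List[float]:
    """Posterior mean of the bin probabilities under an assumed Dirichlet(alpha) prior."""
    total = sum(counts) + sum(alpha)
    return [(u + a) / total for u, a in zip(counts, alpha)]


def discrete_bayes_classifier(
    counts0: Sequence[int],
    counts1: Sequence[int],
    alpha0: Sequence[float],
    alpha1: Sequence[float],
    a: float = 1.0,
    b: float = 1.0,
) -> List[int]:
    """Return one label (0 or 1) per bin, minimizing posterior expected error.

    Assumes random sampling, a Beta(a, b) prior on the class-0 probability,
    and independence between that probability and the bin probabilities.
    """
    n0, n1 = sum(counts0), sum(counts1)
    # Posterior mean of the class-0 probability under the assumed beta prior.
    c0 = (n0 + a) / (n0 + n1 + a + b)
    p0 = posterior_mean_bins(counts0, alpha0)
    p1 = posterior_mean_bins(counts1, alpha1)
    # Assign each bin to the class with the larger expected mass (ties go to class 0).
    return [0 if c0 * q0 >= (1.0 - c0) * q1 else 1 for q0, q1 in zip(p0, p1)]


# Hypothetical small sample: four bins, four observations per class, uniform priors.
labels = discrete_bayes_classifier(
    counts0=[3, 1, 0, 0],
    counts1=[0, 1, 2, 1],
    alpha0=[1, 1, 1, 1],
    alpha1=[1, 1, 1, 1],
)
print(labels)  # [0, 0, 1, 1]
```

With so few samples the prior hyperparameters visibly shape the result: bin 1 has one observation from each class, and the tie is broken toward class 0 only by the convention chosen in this sketch.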

Keywords

Bayesian estimation; Error estimation; Classification; Small samples; Genomics