Multilabel classifiers with a probabilistic thresholding strategy

Article ID	Journal	Published Year	Pages	File Type
533419	Pattern Recognition	2012	8 Pages	PDF

Abstract

In multilabel classification tasks the aim is to find hypotheses able to predict, for each instance, a set of classes or labels rather than a single one. Some state-of-the-art multilabel learners use a thresholding strategy, which consists of computing a score for each label and then predicting the set of labels whose score is higher than a given threshold. When this score is the estimated posterior probability, the selected threshold is typically 0.5. In this paper we introduce a family of thresholding strategies which take into account the posterior probabilities of all possible labels to determine a different threshold for each instance. Thus, we exploit a form of interdependence among labels to compute this threshold, which is optimal with respect to a given expected loss function. We found experimentally that these strategies outperform other thresholding options for multilabel classification. They provide an efficient method to implement a learner which considers the interdependence among labels, in the sense that the overall performance of the prediction of a set of labels prevails over that of each single label.

► We deal with multilabel classification tasks.
► Our approach improves on existing thresholding strategies.
► From the labels' posterior probabilities we find the optimum threshold to predict labels.
► We optimize the expected loss for some loss functions, such as F1 and Accuracy.
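
The per-instance thresholding idea described in the abstract can be illustrated with a small sketch. It is not the paper's algorithm: it assumes labels are independent given the instance, computes the expected loss exactly by enumerating all label sets (feasible only for a handful of labels), and restricts candidates to the top-k labels ranked by posterior probability. The function names `f1_loss`, `expected_loss` and `best_top_k` are hypothetical.

```python
from itertools import product


def f1_loss(y_true, y_pred):
    """Return 1 - F1 for two binary label tuples (F1 = 1 when both are empty)."""
    tp = sum(t and p for t, p in zip(y_true, y_pred))
    denom = sum(y_true) + sum(y_pred)
    return 0.0 if denom == 0 else 1.0 - 2.0 * tp / denom


def expected_loss(probs, y_pred, loss):
    """Exact expected loss of y_pred.

    Assumption of this sketch: labels are independent given the instance,
    so the probability of a label set is the product of per-label terms.
    """
    total = 0.0
    for y_true in product((0, 1), repeat=len(probs)):
        weight = 1.0
        for p, t in zip(probs, y_true):
            weight *= p if t else 1.0 - p
        total += weight * loss(y_true, y_pred)
    return total


def best_top_k(probs, loss=f1_loss):
    """Pick the top-k label set (k = 0..m) with the lowest expected loss.

    Choosing k is equivalent to choosing a threshold for this instance.
    """
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    best = None
    for k in range(len(probs) + 1):
        y_pred = [0] * len(probs)
        for i in order[:k]:
            y_pred[i] = 1
        value = expected_loss(probs, tuple(y_pred), loss)
        if best is None or value < best[1]:
            best = (tuple(y_pred), value)
    return best


prediction, loss_value = best_top_k([0.4, 0.4, 0.4])
```

For three labels that each have posterior probability 0.4, a fixed threshold of 0.5 predicts the empty set, whose expected F1 loss under these assumptions is 0.784. The sketch instead selects all three labels, with an expected F1 loss of about 0.49, which shows why the best threshold can depend on the whole probability vector rather than on each label in isolation.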

Keywords

Posterior probability; Expected loss; Multilabel classification