Article ID: 6865916
Journal: Neurocomputing
Published Year: 2015
Pages: 9
File Type: PDF
Abstract
This paper proposes a methodology for identifying data samples that are likely to be mislabeled in a c-class classification dataset. The methodology relies on the assumption that the generalization error of a model learned from the data decreases when the label of a mislabeled sample is changed to its correct class. The classification model used in the paper is OP-ELM, which also provides a fast estimate of the generalization error through PRESS leave-one-out. The methodology is tested on two toy datasets and on real-life datasets; for one of the latter, expert knowledge about the identified potential mislabels was sought.
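The idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: it assumes a plain ELM-style model (a fixed random tanh hidden layer followed by a linear least-squares output layer) in place of OP-ELM, and a brute-force search over single-label changes. The function names `random_hidden_layer`, `press_loo_mse`, and `suspect_labels` are hypothetical. The only part taken from standard theory is the PRESS identity, which gives each leave-one-out residual as the ordinary residual divided by one minus the matching diagonal entry of the hat matrix.

```python
import numpy as np


def random_hidden_layer(X, n_hidden=10, seed=0):
    """Hypothetical helper: random tanh features plus a bias column.

    This is a plain ELM-style hidden layer, not OP-ELM (no neuron
    ranking or pruning is performed).
    """
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    return np.column_stack([np.tanh(X @ W + b), np.ones(len(X))])


def press_loo_mse(H, T):
    """Leave-one-out MSE of the least-squares fit T ~ H @ B via PRESS."""
    P = np.linalg.pinv(H)                    # shape (n_features, n_samples)
    hat_diag = np.einsum("ij,ji->i", H, P)   # diagonal of H @ pinv(H)
    resid = T - H @ (P @ T)                  # ordinary training residuals
    loo_resid = resid / (1.0 - hat_diag)[:, None]
    return float(np.mean(loo_resid ** 2))


def suspect_labels(H, y, n_classes):
    """List (index, proposed_class, error_decrease) for every single-label
    change that lowers the PRESS leave-one-out error, largest decrease first.
    """
    n = len(y)
    P = np.linalg.pinv(H)
    hat_diag = np.einsum("ij,ji->i", H, P)
    M = np.eye(n) - H @ P                    # maps targets to residuals
    scale = (1.0 - hat_diag)[:, None]

    def loo(T):
        return float(np.mean(((M @ T) / scale) ** 2))

    one_hot = np.eye(n_classes)
    T = one_hot[y]                           # one-hot encoded targets
    base = loo(T)
    found = []
    for i in range(n):
        for c in range(n_classes):
            if c == y[i]:
                continue
            T_try = T.copy()
            T_try[i] = one_hot[c]            # tentatively relabel sample i
            gain = base - loo(T_try)
            if gain > 0:
                found.append((i, c, gain))
    return sorted(found, key=lambda item: -item[2])
```

A typical call would be `suspect_labels(random_hidden_layer(X), y, n_classes)`, where `X` is an array of shape `(n_samples, n_inputs)` and `y` holds integer labels in `0..n_classes-1`. Because the hat matrix depends only on the inputs, it is computed once and reused for every candidate relabeling, which is what makes the leave-one-out estimate cheap here.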
Related Topics
Physical Sciences and Engineering > Computer Science > Artificial Intelligence