Article ID: 532307
Journal: Pattern Recognition
Published Year: 2013
Pages: 13 Pages
File Type: PDF
Abstract

The excellence of a given learner is usually claimed through a performance comparison with other learners over a collection of data sets. Too often, researchers are not aware of the impact of their data selection on the results. Their test beds are small, and the selection of the data sets is not supported by any previous data analysis. Conclusions drawn on such test beds cannot be generalised, because particular data characteristics may favour certain learners in ways that go unnoticed. This work raises these issues and proposes the characterisation of data sets using complexity measures, which can be helpful both for guiding experimental design and for explaining the behaviour of learners.

► Criticism of experimental assessment of learners based on arbitrarily selected data sets.
► Inclusion of data complexity analysis in the experimental methodology.
► Data complexity analysis to understand learners' behaviours.
► Generation of synthetic data sets to identify domains of competence of learners.
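
To make the idea of characterising a data set with a complexity measure concrete, here is a minimal sketch. The abstract does not say which measures the authors use, so the choice here is an assumption: the maximum Fisher's discriminant ratio, a commonly cited measure of class separability for two-class problems. The function name and the toy data are hypothetical and are not taken from the article.

```python
from statistics import mean, pvariance


def fisher_discriminant_ratio(class_a, class_b):
    """Maximum per-feature Fisher's discriminant ratio for two classes.

    class_a, class_b: lists of equal-length feature vectors.
    A higher value suggests that at least one feature separates the
    classes well, i.e. the data set is "easier" along that feature.
    """
    ratios = []
    # zip(*rows) transposes the rows into per-feature columns.
    for feat_a, feat_b in zip(zip(*class_a), zip(*class_b)):
        gap = (mean(feat_a) - mean(feat_b)) ** 2
        spread = pvariance(feat_a) + pvariance(feat_b)
        if spread == 0:
            # Constant feature in both classes: perfectly separating
            # if the means differ, uninformative otherwise.
            ratios.append(float("inf") if gap > 0 else 0.0)
        else:
            ratios.append(gap / spread)
    return max(ratios)


# Toy data (hypothetical): one well-separated set, one overlapping set.
easy_a = [[0.0, 1.0], [1.0, 2.0]]
easy_b = [[10.0, 1.5], [11.0, 2.5]]
hard_a = [[0.0, 0.0], [2.0, 2.0]]
hard_b = [[1.0, 1.0], [3.0, 3.0]]

easy_score = fisher_discriminant_ratio(easy_a, easy_b)
hard_score = fisher_discriminant_ratio(hard_a, hard_b)
print(easy_score, hard_score)
```

Computing a value like this for every data set in a test bed before running the comparison gives a rough picture of which regions of the complexity space the test bed covers, which is the kind of prior data analysis the abstract argues is usually missing.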

Related Topics
Physical Sciences and Engineering > Computer Science > Computer Vision and Pattern Recognition