An active learning paradigm based on a priori data reduction and organization

Article ID	Journal	Published Year	Pages	File Type
383620	Expert Systems with Applications	2014	12 Pages	PDF

Abstract

• A novel active learning paradigm, called DROP, based on a priori data reduction and organization.
• DROP does not require classification and reorganization of all non-annotated samples in the dataset at each iteration.
• The proposed paradigm achieves high accuracy quickly with minimal user interaction.
• Results are shown with different clustering and classification strategies on a variety of real-world datasets.

In the past few years, active learning has been reasonably successful and has drawn considerable attention. However, recent active learning methods have focused on strategies in which a large unlabeled dataset must be reprocessed at each learning iteration. As datasets grow, these strategies become inefficient or even computationally prohibitive. To address these issues, we propose an effective and efficient active learning paradigm that significantly reduces the size of the learning set through an a priori process that identifies and organizes a small relevant subset. Furthermore, classification and selection are performed concomitantly, so that only a very small number of samples need to be classified while the informative ones are selected. Experimental results show that the proposed paradigm achieves high accuracy quickly with minimal user interaction, which further improves its efficiency.
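
The abstract gives no algorithmic detail, so the following is only a minimal sketch of the general idea, not the authors' DROP implementation. It assumes a simple k-means clustering, a "margin" heuristic that keeps the samples closest to cluster boundaries, and a 1-nearest-neighbor classifier; all function names (`kmeans`, `reduce_and_organize`, `active_learning`, `predict`) and parameters are hypothetical. What it illustrates is the structure the abstract describes: the dataset is reduced and ordered once, before learning starts, and each iteration then classifies only the next small batch from that ordered list instead of the whole unlabeled set.

```python
import math

def farthest_first(points, k):
    """Deterministic seeding: start at points[0], then add the farthest point."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        )
    return centers

def kmeans(points, k, iters=10):
    """Plain k-means (an assumed clustering choice, not from the source)."""
    centers = farthest_first(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: math.dist(p, centers[j]))
            groups[nearest].append(p)
        centers = [
            tuple(sum(axis) / len(g) for axis in zip(*g)) if g else centers[j]
            for j, g in enumerate(groups)
        ]
    return centers

def reduce_and_organize(points, centers, keep):
    """A priori step: keep the `keep` samples nearest a cluster boundary,
    ordered from most to least ambiguous. Requires at least two centers."""
    def margin(p):
        d = sorted(math.dist(p, c) for c in centers)
        return d[1] - d[0]  # small margin = close to a boundary
    return sorted(points, key=margin)[:keep]

def predict(p, labeled):
    """1-nearest-neighbor prediction over the labeled samples (assumed classifier)."""
    nearest = min(labeled, key=lambda q: math.dist(p, q))
    return labeled[nearest]

def active_learning(organized, centers, points, oracle, batch=5):
    """Classify only the next batch of the pre-organized list per iteration;
    the oracle (the user) confirms or corrects each suggested label."""
    labeled = {}
    for c in centers:  # initial training set: one sample per cluster
        seed = min(points, key=lambda p: math.dist(p, c))
        labeled[seed] = oracle(seed)
    corrections = 0
    for start in range(0, len(organized), batch):
        shown = organized[start:start + batch]
        guesses = [(p, predict(p, labeled)) for p in shown]
        for p, guess in guesses:
            truth = oracle(p)
            if guess != truth:
                corrections += 1  # user had to fix the suggestion
            labeled[p] = truth
    return labeled, corrections
```

In this sketch the per-iteration cost depends on the batch size and the number of labeled samples, not on the size of the full dataset, which is the efficiency argument the abstract makes. The number of corrections is one rough proxy for user interaction.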

Keywords

Pattern recognition; Image annotation; Data mining; Active learning; Machine learning