Article code: 410341
Journal code: 679137
Publication year: 2013
English article: 9-page PDF
Full text: free download
English title of the ISI article
Certainty-based active learning for sampling imbalanced datasets
Related topics
Engineering and Basic Sciences, Computer Engineering, Artificial Intelligence
English abstract


• Our method solves the imbalanced data classification problem in active learning.
• We utilize local behaviors in specific areas to identify queried samples.
• The query probability of a sample is determined within the explored neighborhood.
• Each neighborhood is explored incrementally, without defining its size in advance.

Active learning aims to learn an accurate classifier from as few queried labels as possible. For practical applications, we propose a Certainty-Based Active Learning (CBAL) algorithm to solve the imbalanced data classification problem in active learning. The importance of each unlabeled sample is measured within an explored neighborhood, so it is not affected by irrelevant samples that might overwhelm the minority class. To handle the agnostic case, IWAL-ERM is integrated into our approach without additional cost. CBAL thus determines the query probability of each unlabeled sample within its explored neighborhood. The neighborhood is explored incrementally, so its size need not be defined in advance. Our theoretical analysis shows that CBAL achieves a polynomial label query improvement over passive learning. Experimental results on synthetic and real-world datasets show that CBAL can identify informative samples and handle the imbalanced data classification problem in active learning.
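
The abstract does not give the algorithm's details, so the following is only a minimal Python sketch of the general idea it describes: grow a neighborhood around an unlabeled sample one labeled neighbor at a time, stop once the labels seen so far look certain, and turn that certainty into a query probability. It is not the authors' CBAL. The function names, the Hoeffding-style stopping margin, the parameters `delta` and `p_min`, and the rule `1 - certainty` are all assumptions made for illustration. The `1 / p` weight follows the usual importance-weighting convention for samples queried with probability `p`.

```python
import math
import random


def query_probability(x, labeled, delta=0.05, p_min=0.1):
    """Hypothetical sketch: query probability from an incrementally grown neighborhood.

    x       -- the unlabeled sample, a tuple of floats
    labeled -- list of (point, label) pairs with label in {+1, -1}
    delta   -- assumed confidence parameter for the stopping margin
    p_min   -- assumed lower bound so importance weights stay finite
    """
    # Visit labeled points from nearest to farthest; no neighborhood size is fixed.
    neighbors = sorted(labeled, key=lambda item: math.dist(x, item[0]))
    balance = 0  # running sum of the labels inside the neighborhood
    for n, (_, y) in enumerate(neighbors, start=1):
        balance += y
        certainty = abs(balance) / n
        # Assumed Hoeffding-style margin that shrinks as the neighborhood grows.
        margin = math.sqrt(math.log(2 / delta) / (2 * n))
        if certainty > margin:
            # The local labels agree strongly, so the sample is rarely queried.
            return max(p_min, 1.0 - certainty)
    # No certain neighborhood was found: always query.
    return 1.0


def maybe_query(x, labeled, rng):
    """Flip a coin with the query probability; return (queried, importance weight)."""
    p = query_probability(x, labeled)
    if rng.random() < p:
        return True, 1.0 / p
    return False, 0.0
```

Under these assumptions, a sample whose nearest labeled neighbors all share one label receives the floor probability `p_min`, while a sample sitting between the two classes keeps a query probability of 1. Because the neighborhood stops growing as soon as it is certain, distant majority-class samples never enter the computation for a point inside a minority-class cluster.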

Publisher
Database: Elsevier - ScienceDirect
Journal: Neurocomputing - Volume 119, 7 November 2013, Pages 350–358