Article code: 408247
Journal code: 679014
Publication year: 2016
English article: 12-page PDF
Full-text version: free download
English title of the ISI article
Positive and unlabeled learning in categorical data
Persian translation of the title
یادگیری مثبت و بدون برچسب در داده های طبقه بندی شده
Keywords
Unlabeled learning, partially supervised learning, distance learning, categorical data
Related topics
Engineering and Basic Sciences, Computer Engineering, Artificial Intelligence
English abstract


• We propose a distance learning framework for PU learning with categorical data.
• Unlike existing methods for categorical data, our method does not rely on the independence assumption.
• Our distance learning approach leverages attribute context.
• The experiments show statistically significant improvements in terms of prediction rate.
• The results are stable and robust with respect to parameter variations.

In common binary classification scenarios, both positive and negative examples must be present in the training data to build an efficient classifier. Unfortunately, in many domains this requirement is not satisfied and only one class of examples is available. To cope with this setting, classification algorithms have been introduced that learn from Positive and Unlabeled (PU) data. Originally, these approaches were applied to document classification. Only a few works address the PU problem for categorical datasets, and the available algorithms are mainly based on Naive Bayes classifiers. In this work we present Pulce, a new distance-based PU learning approach for categorical data. Our framework takes advantage of the intrinsic relationships between attribute values and goes beyond the independence assumption made by Naive Bayes. Pulce leverages the statistical properties of the data to learn a distance metric that is then employed during the classification task. We extensively validate our approach on real-world datasets and demonstrate that our strategy obtains statistically significant improvements with respect to state-of-the-art competitors.
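
To make the distance-based PU setting in the abstract concrete, here is a minimal Python sketch. It is not Pulce: the learned, context-aware metric is replaced by a plain overlap distance (the fraction of mismatching attributes), and the two-step scheme (pick reliable negatives from the unlabeled set, then compare mean distances) is a generic PU strategy assumed for illustration. All function names and the toy records are hypothetical.

```python
from statistics import mean


def overlap_distance(a, b):
    """Fraction of attributes on which two categorical records differ.

    Stand-in for a learned metric; it treats attributes independently.
    """
    return sum(x != y for x, y in zip(a, b)) / len(a)


def mean_distance(record, group):
    """Average overlap distance from one record to every record in a group."""
    return mean(overlap_distance(record, other) for other in group)


def reliable_negatives(positives, unlabeled, n_negatives):
    """Pick the unlabeled records that are farthest, on average, from the positives."""
    ranked = sorted(
        unlabeled,
        key=lambda r: mean_distance(r, positives),
        reverse=True,
    )
    return ranked[:n_negatives]


def classify(record, positives, negatives):
    """Return True if the record is at least as close to the positives as to the negatives."""
    return mean_distance(record, positives) <= mean_distance(record, negatives)


# Toy data: each record is a tuple of categorical attribute values.
positives = [("red", "round", "small"), ("red", "round", "large")]
unlabeled = [
    ("red", "round", "medium"),
    ("blue", "square", "large"),
    ("green", "square", "small"),
]

negatives = reliable_negatives(positives, unlabeled, n_negatives=2)
is_positive = classify(("red", "round", "tiny"), positives, negatives)
```

A learned metric of the kind the abstract describes would replace `overlap_distance`; the rest of the sketch only needs some function that maps two records to a non-negative number.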

Publisher
Database: Elsevier - ScienceDirect
Journal: Neurocomputing - Volume 196, 5 July 2016, Pages 113–124