کد مقاله کد نشریه سال انتشار مقاله انگلیسی نسخه تمام متن
535129 870323 2009 10 صفحه PDF دانلود رایگان
عنوان انگلیسی مقاله ISI
Feature subset selection from positive and unlabelled examples
موضوعات مرتبط
مهندسی و علوم پایه مهندسی کامپیوتر چشم انداز کامپیوتر و تشخیص الگو
پیش نمایش صفحه اول مقاله
Feature subset selection from positive and unlabelled examples
چکیده انگلیسی

The feature subset selection problem has a growing importance in many machine learning applications where the amount of variables is very high. There is a great number of algorithms that can approach this problem in supervised databases but, when examples from one or more classes are not available, supervised feature subset selection algorithms cannot be directly applied. One of these algorithms is the correlation based filter selection (CFS). In this work we propose an adaptation of this algorithm that can be applied when only positive and unlabelled examples are available. As far as we know, this is the first time the feature subset selection problem is studied in the positive unlabelled learning context. We have tested this adaptation on synthetic datasets obtained by sampling Bayesian network models where we know which variables are (in)dependent of the class. We have also tested our adaptations on real-life databases where the absence of negative examples has been simulated. The results show that, having enough positive examples, it is possible to obtain good solutions to the feature subset selection problem when only positive and unlabelled instances are available.

ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Pattern Recognition Letters - Volume 30, Issue 11, 1 August 2009, Pages 1027–1036
نویسندگان
, , ,