Article code: 6941381
Journal code: 870256
Publication year: 2014
English article: 12-page PDF
Full-text version: free download
English title of the ISI article
On incrementally using a small portion of strong unlabeled data for semi-supervised learning algorithms
Related topics
Engineering and Basic Sciences; Computer Engineering; Computer Vision and Pattern Recognition
English abstract
The aim of this paper is to present an incremental selection strategy by which the classification accuracy of semi-supervised learning (SSL) algorithms can be improved. In SSL, both a limited number of labeled and a multitude of unlabeled data are utilized to learn a classification model. However, it is also well known that the utilization of the unlabeled data is not always helpful for SSL algorithms. To efficiently use them in learning the classification model, some of the unlabeled data that are deemed useful for the learning process are selected and given the correctly estimated labels. To address this problem, especially when dealing with semi-supervised MarginBoost (SSMB) algorithm (d'Alché-Buc et al., 2002), in this paper, two selection strategies, named simply recycled selection and incrementally reinforced selection, are considered and empirically compared. Our experimental results, obtained with well-known benchmark data sets, including SSL-type benchmarks and some UCI data sets, demonstrate that the latter, i.e., selecting only a small portion of strong examples from the available unlabeled data in an incremental fashion, can compensate for the shortcomings of the existing SSMB algorithm. Moreover, compared to the former, it generally achieves better classification accuracy results.
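The incremental idea in the abstract (add only a small, confidently labeled portion of the unlabeled pool in each round, then retrain) can be illustrated with a minimal sketch. This is not the SSMB algorithm or the paper's selection criterion: a nearest-centroid classifier stands in for the learner, "strong" is assumed to mean a large gap between the two nearest centroid distances, and the names `incremental_selection`, `portion`, and `rounds` are hypothetical.

```python
import math

def centroids(points, labels):
    """Mean vector of each class in the current labeled set."""
    sums, counts = {}, {}
    for p, y in zip(points, labels):
        acc = sums.setdefault(y, [0.0] * len(p))
        for i, v in enumerate(p):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}

def predict_with_confidence(cents, p):
    """Return (label, confidence) for point p.

    Assumption: confidence is the gap between the distances to the two
    nearest class centroids, so at least two classes are required.
    """
    dists = sorted((math.dist(p, c), y) for y, c in cents.items())
    (d0, y0), (d1, _) = dists[0], dists[1]
    return y0, d1 - d0

def incremental_selection(X_lab, y_lab, X_unl, portion=0.2, rounds=3):
    """Grow the labeled set by a small portion of confident examples per round."""
    X_lab, y_lab, pool = list(X_lab), list(y_lab), list(X_unl)
    for _ in range(rounds):
        if not pool:
            break
        cents = centroids(X_lab, y_lab)  # retrain on the enlarged labeled set
        scored = [(predict_with_confidence(cents, p), p) for p in pool]
        scored.sort(key=lambda t: t[0][1], reverse=True)  # most confident first
        k = max(1, int(portion * len(pool)))  # only a small portion each round
        for (label, _), p in scored[:k]:
            X_lab.append(p)
            y_lab.append(label)  # estimated label, not ground truth
        pool = [p for _, p in scored[k:]]
    return X_lab, y_lab, pool

X_lab = [(0.0, 0.0), (10.0, 10.0)]
y_lab = [0, 1]
X_unl = [(0.5, 0.5), (9.5, 9.5), (5.0, 5.2), (1.0, 0.0), (9.0, 10.0)]
X_out, y_out, remaining = incremental_selection(X_lab, y_lab, X_unl, portion=0.4, rounds=3)
```

In this toy run the ambiguous point near the class boundary is never selected, which is the behavior the abstract attributes to using only strong examples; a recycled variant would instead reselect from the full unlabeled pool in every round.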
Publisher
Database: Elsevier - ScienceDirect
Journal: Pattern Recognition Letters - Volume 41, 1 May 2014, Pages 53-64