Article code | Journal code | Publication year | English article | Full-text version
533738 | 870161 | 2015 | 9-page PDF | Free download
English title of the ISI article
Modified criterion to select useful unlabeled data for improving semi-supervised support vector machines
Persian translation of the title
معیار اصلاح شده برای انتخاب داده های بدون برچسب مناسب برای بهبود ماشین های بردار پشتیبانی نیمه نظارت
Keywords
Semi-supervised learning, Semi-supervised boosting, Support vector machines, Semi-supervised support vector machines
Related topics
Engineering and Basic Sciences, Computer Engineering, Computer Vision and Pattern Recognition
English abstract


• A small amount of unlabeled data was selected to enhance classification accuracy of S3VMs.
• To select them efficiently, impacts of the labeled data and the unlabeled data were balanced.
• The class-conditional probabilities of unlabeled samples were utilized as uncertainty levels.
• Run-time characteristics and error rates of the modified criterion were empirically evaluated.

Recent studies have demonstrated that semi-supervised learning (SSL) approaches that use both labeled and unlabeled data are more effective and robust than those that use only labeled data. In SemiBoost, a boosting framework for SSL, a similarity-based criterion is used to select (and utilize) a small amount of useful unlabeled data. However, this criterion sometimes does not work appropriately, particularly when the unlabeled data are near the boundary. To address this concern, this paper modifies the selection criterion to use the class-conditional probability in addition to the similarity. First, the criterion is decomposed into three terms: a positive-class term, a negative-class term, and an unlabeled term. Second, when computing the confidences of the unlabeled data, the impacts of the three terms on the confidences are adjusted using the estimated conditional probability. Third, some unlabeled data with higher confidences are selected and, together with the labeled data, used to retrain a supervised classifier. This select-and-train process is repeated until a termination condition is met. Experimental results, obtained using semi-supervised support vector machines (S3VMs) on benchmark data, demonstrate that the proposed algorithm can compensate for the shortcomings of traditional S3VMs and, compared with previous approaches, can achieve higher classification accuracy.
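
The select-and-train loop described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the abstract does not give the criterion's formula, so the weighting in `confidences` (positive term scaled by an estimated probability `p`, negative term by `1 - p`, plus a down-weighted unlabeled term) is an assumption made for illustration. A nearest-centroid model stands in for the S3VM, an RBF kernel stands in for the similarity, and all function names and parameters (`sigma`, `unl_weight`, `per_round`, `max_rounds`) are hypothetical.

```python
import math

def rbf(a, b, sigma=1.0):
    """RBF similarity between two points (assumed similarity measure)."""
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-d2 / (2 * sigma ** 2))

def fit_centroids(X, y):
    """Stand-in supervised learner: one centroid per class (+1 / -1).
    Assumes both classes are present in the labeled set."""
    cents = {}
    for label in (+1, -1):
        pts = [x for x, t in zip(X, y) if t == label]
        cents[label] = [sum(c) / len(pts) for c in zip(*pts)]
    return cents

def prob_positive(cents, x):
    """Crude estimate of P(y = +1 | x) from similarity to the two centroids."""
    sp, sn = rbf(x, cents[+1]), rbf(x, cents[-1])
    return 0.5 if sp + sn == 0 else sp / (sp + sn)

def predict(cents, x):
    return 1 if prob_positive(cents, x) >= 0.5 else -1

def confidences(X_lab, y_lab, X_unl, cents, unl_weight=0.5):
    """Signed confidence per unlabeled point (hypothetical weighting)."""
    scores = []
    for i, x in enumerate(X_unl):
        p = prob_positive(cents, x)
        # Positive-class and negative-class terms: similarity to labeled data.
        pos = sum(rbf(x, xl) for xl, t in zip(X_lab, y_lab) if t == +1)
        neg = sum(rbf(x, xl) for xl, t in zip(X_lab, y_lab) if t == -1)
        # Unlabeled term: other unlabeled points vote with their soft labels.
        unl = sum(rbf(x, xu) * (2 * prob_positive(cents, xu) - 1)
                  for j, xu in enumerate(X_unl) if j != i)
        scores.append(p * pos - (1 - p) * neg + unl_weight * unl)
    return scores

def select_and_train(X_lab, y_lab, X_unl, per_round=2, max_rounds=10):
    """Repeat: score unlabeled data, move the most confident into the
    labeled set with pseudo-labels, retrain. Stops when the unlabeled
    pool is empty or max_rounds is reached (assumed termination rule)."""
    X_lab, y_lab, X_unl = list(X_lab), list(y_lab), list(X_unl)
    cents = fit_centroids(X_lab, y_lab)
    for _ in range(max_rounds):
        if not X_unl:
            break
        scores = confidences(X_lab, y_lab, X_unl, cents)
        ranked = sorted(range(len(X_unl)),
                        key=lambda i: abs(scores[i]), reverse=True)
        chosen = set(ranked[:per_round])
        for i in chosen:
            X_lab.append(X_unl[i])
            y_lab.append(1 if scores[i] >= 0 else -1)
        X_unl = [x for i, x in enumerate(X_unl) if i not in chosen]
        cents = fit_centroids(X_lab, y_lab)
    return cents, X_lab, y_lab
```

Calling `select_and_train([(0, 0), (4, 4)], [1, -1], unlabeled_points)` returns the final centroids together with the enlarged labeled set. In a realistic setting the centroid model would be replaced by an SVM with probability estimates, and the termination rule would follow whatever condition the paper specifies.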

Publisher
Database: Elsevier - ScienceDirect
Journal: Pattern Recognition Letters - Volumes 60–61, 1 August 2015, Pages 48–56