Article code, Journal code, Publication year, English article, Full-text version
530783, 869787, 2014, 10-page PDF, Free download
English title of the ISI article
A multiple criteria active learning method for support vector regression
Persian translation of the title
روش یادگیری فعال چند معیاره برای رگرسیون بردار پشتیبانی
Keywords
Related topics
Engineering and Basic Sciences, Computer Engineering, Computer Vision and Pattern Recognition
English abstract


• We present an active learning method in the context of ε-insensitive SVR.
• The proposed method exploits relevancy, diversity and density criteria.
• The three criteria are implemented according to the SVR properties.
• All three criteria are applied in two consecutive steps.
• Experimental results show the robustness of the proposed technique.
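
As background for the ε-insensitive setting named in the highlights, the standard ε-insensitive loss is commonly written as below. The symbols $f$, $\mathbf{x}$, and $y$ are generic notation assumed here for illustration, not taken from the paper.

```latex
L_\varepsilon\bigl(y, f(\mathbf{x})\bigr) = \max\bigl(0,\; |y - f(\mathbf{x})| - \varepsilon\bigr)
```

Training samples with $|y - f(\mathbf{x})| < \varepsilon$ lie strictly inside the ε-tube, incur zero loss, and are not support vectors; samples on or outside the tube boundary are the support vectors.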

This paper presents a novel active learning method developed in the framework of ε-insensitive support vector regression (SVR) for solving regression problems with a small initial training set. The proposed method iteratively selects the most informative and representative unlabeled samples to be included in the training set by jointly evaluating three criteria: (i) relevancy, (ii) diversity, and (iii) density of samples. All three criteria are implemented according to the SVR properties and are applied in two consecutive clustering-based steps. In the first step, a novel measure is defined to select the most relevant samples, i.e., those that have a high probability of being located either outside or on the boundary of the ε-tube of the SVR. To this end, a clustering method is first applied to all unlabeled samples together with the training samples that are inside the ε-tube (those that are not support vectors, i.e., non-SVs); then the clusters that contain non-SVs are eliminated. The unlabeled samples in the remaining clusters are considered the most relevant patterns. In the second step, a novel measure is defined to select diverse samples among the relevant patterns from the high-density regions of the feature space, in order to better model the SVR learning function. To this end, the clusters with the highest density of samples are first chosen to identify the highest-density regions of the feature space. Then, from each selected cluster, the sample associated with the portion of the feature space having the highest density (i.e., the sample most representative of the underlying distribution of samples in that cluster) is selected to be included in the training set. In this way, diverse samples taken from high-density regions are efficiently identified. Experimental results obtained on four different data sets show the robustness of the proposed technique, particularly when only a small initial training set is available.
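
The two selection steps described in the abstract can be sketched roughly as follows. This is a minimal illustration under several assumptions, not the authors' implementation: plain k-means with a deterministic farthest-first initialization stands in for the unspecified clustering method, cluster size is used as the density measure, and the "most representative" sample of a cluster is taken to be the one with the smallest mean distance to the other members. The names `kmeans`, `select_batch`, `k`, and `h` are hypothetical, and `non_svs` is assumed to come from whatever SVR implementation is in use.

```python
import math


def kmeans(points, k, iters=50):
    """Plain Lloyd's k-means (assumed stand-in); returns one label per point."""
    # Deterministic farthest-first initialization of the k centers.
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    labels = None
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        new_labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        if new_labels == labels:
            break  # assignments are stable, so the clustering has converged
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels


def select_batch(unlabeled, non_svs, k, h):
    """Return indices into `unlabeled` of up to h samples to label next."""
    points = list(unlabeled) + list(non_svs)
    labels = kmeans(points, k)
    n_u = len(unlabeled)

    # Step 1 (relevancy): eliminate every cluster that contains a non-SV.
    eliminated = set(labels[n_u:])
    clusters = {}
    for i in range(n_u):
        if labels[i] not in eliminated:
            clusters.setdefault(labels[i], []).append(i)

    # Step 2a (density): keep the h most populated remaining clusters
    # (cluster size as density is an assumption of this sketch).
    densest = sorted(clusters.values(), key=len, reverse=True)[:h]

    # Step 2b (density + diversity): one sample per kept cluster, chosen as
    # the member with the smallest mean distance to its cluster (assumption).
    def mean_dist(i, members):
        return sum(math.dist(unlabeled[i], unlabeled[j]) for j in members) / len(members)

    return [min(members, key=lambda i: mean_dist(i, members)) for members in densest]
```

In an active learning loop, the returned samples would be labeled, added to the training set, and the SVR retrained before the next call. Taking one sample per cluster is what provides diversity here, since samples from different clusters are by construction far apart in feature space.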

Publisher
Database: Elsevier - ScienceDirect
Journal: Pattern Recognition - Volume 47, Issue 7, July 2014, Pages 2558–2567