Article code: 4956549
Journal code: 1444525
Publication year: 2016
English article: 9-page PDF
Full-text version: free download
English title of the ISI article
Combining instance selection for better missing value imputation
Persian translation of the title
ترکیبی از انتخاب نمونه برای بدست آوردن ارزش بیشتر از دست رفته
Keywords
Related topics
Engineering and Basic Sciences > Computer Engineering > Computer Networks and Communications
English abstract
In practice, the data collected from data mining usually contain some missing values. Imputation is the process of replacing the missing values in incomplete datasets. It is usually based on providing estimations for missing values by reasoning from the observed data. Consequently, the effectiveness of missing value imputation is heavily dependent on the observed data (or complete data) in the incomplete datasets. The objective of this study is to investigate the effect of performing instance selection to filter out some noisy data (or outliers) from a given dataset on the imputation task. Specifically, four different processes for combining instance selection and missing value imputation are proposed and compared in terms of data classification. The experimental results based on 29 datasets containing categorical, numerical, and mixed attribute types of data show that the process of performing instance selection first and imputation second allows the k-NN and SVM classifiers to outperform the other processes over the categorical and numerical datasets. For the mixed type of datasets, k-NN performs the best when instance selection is performed again on the datasets produced by the second process. Finally, some specific decision rules about when to employ which process are also provided for future research.
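The "instance selection first, imputation second" process from the abstract can be illustrated with a minimal sketch. The abstract does not say which instance selection or imputation algorithms the study uses, so everything below is an assumption made for illustration: Wilson's Edited Nearest Neighbor rule stands in for instance selection, a k-nearest-neighbor mean stands in for imputation, the function names are hypothetical, and the toy data are numerical only.

```python
import math

def distance(a, b, cols):
    """Euclidean distance between two rows, restricted to the given columns."""
    return math.sqrt(sum((a[c] - b[c]) ** 2 for c in cols))

def edited_nearest_neighbor(rows, labels, k=3):
    """Stand-in instance selection (Wilson's ENN): keep a row only if the
    majority of its k nearest neighbours carry the same class label."""
    cols = range(len(rows[0]))
    keep = []
    for i, row in enumerate(rows):
        others = sorted((j for j in range(len(rows)) if j != i),
                        key=lambda j: distance(row, rows[j], cols))[:k]
        agree = sum(1 for j in others if labels[j] == labels[i])
        if agree * 2 > len(others):
            keep.append(i)
    return keep

def knn_impute(row, reference_rows, k=3):
    """Stand-in imputation: fill each missing value (None) with the mean of
    that column over the k reference rows closest on the observed columns."""
    observed = [c for c, v in enumerate(row) if v is not None]
    nearest = sorted(reference_rows,
                     key=lambda r: distance(row, r, observed))[:k]
    return [v if v is not None else sum(r[c] for r in nearest) / len(nearest)
            for c, v in enumerate(row)]

def select_then_impute(rows, labels, k=3):
    """Filter noisy complete rows first, then impute the incomplete rows
    using only the complete rows that survived the filter."""
    complete = [i for i, r in enumerate(rows) if None not in r]
    kept = edited_nearest_neighbor([rows[i] for i in complete],
                                   [labels[i] for i in complete], k)
    reference = [rows[complete[i]] for i in kept]
    return [list(r) if None not in r else knn_impute(r, reference, k)
            for r in rows]

# Toy data (hypothetical): two clusters, one noisy row, one incomplete row.
rows = [
    [1.0, 1.0], [1.2, 0.9], [0.9, 1.1], [1.1, 1.0],   # class "a"
    [5.0, 5.0], [5.1, 4.9], [4.9, 5.2], [5.2, 5.1],   # class "b"
    [1.0, 9.0],                                       # noisy row labelled "a"
    [1.0, None],                                      # second value missing
]
labels = ["a", "a", "a", "a", "b", "b", "b", "b", "a", "a"]

imputed = select_then_impute(rows, labels)
unfiltered = knn_impute(rows[9], [r for r in rows if None not in r])
print(imputed[9], unfiltered)
```

In this toy case the noisy row sits far from the other class "a" rows, so the filter drops it before imputation; without the filter it would be one of the nearest neighbors of the incomplete row and would pull the imputed value upward.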
Publisher
Database: Elsevier - ScienceDirect
Journal: Journal of Systems and Software - Volume 122, December 2016, Pages 63-71
Authors