Article code: 536225
Journal code: 870482
Publication year: 2015
English article: 6 pages, PDF
Full-text version: free download
English title of the ISI article
Multi-objective genetic algorithm for missing data imputation
Persian translation of the title
الگوریتم ژنتیک چند هدفه برای حذف محاسبه داده ها
Keywords
Related topics
Engineering and Basic Sciences > Computer Engineering > Computer Vision and Pattern Recognition
English abstract


• The paper proposes a novel Multi-objective Genetic Algorithm for Data Imputation, called MOGAImp.
• This is the first method that applies a multi-objective approach to data imputation.
• MOGAImp presents a good tradeoff between the evaluation measures studied.
• The results confirm that MOGAImp is the preferable choice when the evaluation measures conflict.
• MOGAImp's codification scheme makes it possible to adapt the method to different application domains.

A large number of data analysis techniques have been developed in recent years; however, most of them do not deal satisfactorily with a ubiquitous problem in the area: missing data. To mitigate the bias imposed by this problem, several treatment methods have been proposed, most notably data imputation methods, which can be viewed as an optimization problem whose goal is to reduce the bias caused by the absence of information. However, most imputation methods are restricted to one type of variable, either categorical or continuous. To fill this gap, this paper presents a multi-objective genetic algorithm for data imputation called MOGAImp, based on NSGA-II, which is suitable for mixed-attribute datasets and takes into account information from incomplete instances and the modeling task. The algorithm was evaluated on 30 datasets with induced missing values, using five classifiers divided into three classes (rule induction learning, lazy learning, and approximate models), and was compared with three techniques from the literature. The results confirm that MOGAImp outperforms some well-established missing data treatment methods. Furthermore, the proposed method proved to be flexible, since it can be adapted to different application domains.
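
The abstract frames imputation as a multi-objective optimization built on NSGA-II, whose central idea is Pareto dominance between candidate solutions. The following is only a minimal sketch of that idea, not the MOGAImp implementation: the two objectives (a classification error and an imputation error, both minimized), the function names, and the score values are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if `a` Pareto-dominates `b` under minimization:
    `a` is no worse on every objective and strictly better on at least one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(scores: List[Sequence[float]]) -> List[int]:
    """Return the indices of candidates that no other candidate dominates."""
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


# Hypothetical objective values for four candidate imputations:
# (classification error, imputation error). These numbers are illustrative only.
candidates = [(0.20, 0.50), (0.25, 0.40), (0.30, 0.60), (0.20, 0.45)]
front = pareto_front(candidates)
print(front)
```

A full NSGA-II loop would add ranking into successive fronts, crowding-distance selection, crossover, and mutation; the sketch stops at the first non-dominated front, which is the set of trade-off solutions a multi-objective method would offer to the user.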

Publisher
Database: Elsevier - ScienceDirect
Journal: Pattern Recognition Letters - Volume 68, Part 1, 15 December 2015, Pages 126–131