Article code: 381970
Journal code: 660712
Publication year: 2016
English article: 19 pages, PDF
Full-text version: free download
English title of the ISI article
An embedded imputation method via Attribute-based Decision Graphs
Keywords
Missing attribute values; Data imputation; Single imputation; Attribute-based Decision Graphs; Machine learning-based imputation method
Related topics
Engineering and Basic Sciences > Computer Engineering > Artificial Intelligence
English abstract


• Attribute-based Decision Graphs represent the correlation among data attributes.
• Similar data instances induce similar subgraphs in the AbDG.
• Imputation partially matches instances against the AbDG, searching for a suitable subgraph.
• The method has low computational costs and handles high rates of missing values.
• Results show the method imputes data efficiently prior to classification tasks.

The performance of classification algorithms is highly dependent on the quality of the training data. Missing attribute values are quite common in many real-world applications; in such cases, a complementary method is necessary to improve the quality of the data and, consequently, the performance of the classifier. Two strategies are commonly employed in practice to deal with this problem: 1) multiple imputation, which often maintains the statistical properties of the original data and usually performs well, at the expense of high computational cost; and 2) single imputation, which in general provides a suitable solution for data sets with few missing attribute values but hardly achieves good results when the number of missing values is high. This paper proposes a new single imputation method which uses Attribute-based Decision Graphs (AbDGs) to estimate the missing values. AbDGs are a new type of data graph which embeds the information contained in the training set into a graph structure, built over pre-defined intervals of values from different attributes. As a consequence, similar data instances induce similar subgraphs when projected onto the AbDG, resulting in distinct patterns of connections. The main contribution of the paper is a well-defined procedure for performing imputation by partially matching instances with missing values against the AbDG. The proposed imputation method can effectively deal with data sets having high rates of missing attribute values while incurring low computational cost, a significant result towards the development of robust expert and intelligent systems. The obtained results provide evidence that the proposed method is sound and promotes quality imputation for classification purposes.
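
The abstract describes the idea only at a high level, so the following is a minimal sketch of interval-graph imputation by partial matching, not the authors' algorithm. Several choices are assumptions made purely for illustration: equal-width intervals, edge weights given by plain co-occurrence counts, and filling a missing value with the mean of the training values in the best-matching interval. The function names `interval_index`, `build_graph` and `impute` are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def interval_index(value, lo, hi, n_bins):
    """Map a value to one of n_bins equal-width intervals over [lo, hi]."""
    if hi == lo:
        return 0
    idx = int((value - lo) / (hi - lo) * n_bins)
    return min(max(idx, 0), n_bins - 1)  # clamp out-of-range values


def build_graph(rows, n_bins=3):
    """Build a simplified interval graph from complete training rows.

    Vertices are (attribute, interval) pairs. An edge weight counts how often
    two intervals of different attributes co-occur in the same row
    (an assumed weighting, chosen for simplicity).
    """
    n_attrs = len(rows[0])
    bounds = [(min(r[a] for r in rows), max(r[a] for r in rows))
              for a in range(n_attrs)]
    weights = defaultdict(int)    # (vertex_u, vertex_v) -> co-occurrence count
    members = defaultdict(list)   # vertex -> training values in that interval
    for row in rows:
        nodes = [(a, interval_index(row[a], *bounds[a], n_bins))
                 for a in range(n_attrs)]
        for node, value in zip(nodes, row):
            members[node].append(value)
        for u in nodes:
            for v in nodes:
                if u[0] != v[0]:
                    weights[(u, v)] += 1
    return {"bounds": bounds, "weights": weights,
            "members": members, "n_bins": n_bins}


def impute(graph, row):
    """Fill None entries by partially matching the row against the graph."""
    bounds, n_bins = graph["bounds"], graph["n_bins"]
    # Vertices induced by the attributes that are actually observed.
    observed = [(a, interval_index(v, *bounds[a], n_bins))
                for a, v in enumerate(row) if v is not None]
    filled = list(row)
    for a, v in enumerate(row):
        if v is not None:
            continue
        # Candidate intervals are those that contain training values.
        candidates = [(a, b) for b in range(n_bins)
                      if graph["members"][(a, b)]]
        # Pick the interval most strongly connected to the observed vertices.
        best = max(candidates,
                   key=lambda c: sum(graph["weights"][(c, o)]
                                     for o in observed))
        filled[a] = mean(graph["members"][best])
    return filled


train = [[1.0, 10.0], [1.2, 11.0], [5.0, 50.0], [5.2, 52.0]]
graph = build_graph(train, n_bins=3)
print(impute(graph, [1.1, None]))   # second attribute filled from the low interval
```

Under these assumptions the graph is built in a single pass over the training set, and each imputation only scores a handful of candidate intervals, which is consistent in spirit with the low computational cost the abstract claims.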

Publisher
Database: Elsevier - ScienceDirect
Journal: Expert Systems with Applications - Volume 57, 15 September 2016, Pages 159–177