Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
6856401 | Information Sciences | 2018 | 14 Pages |
Abstract
Mining incomplete data using approximations based on characteristic sets is a well-established technique. It is applicable to incomplete data sets under several interpretations of missing attribute values, e.g., lost values and “do not care” conditions. Maximal consistent blocks, on the other hand, were introduced for incomplete data sets with only “do not care” conditions, using only lower and upper approximations. In this paper, we extend maximal consistent blocks to incomplete data sets with any interpretation of missing attribute values and to probabilistic approximations. We prove new results on probabilistic approximations based on generalized maximal consistent blocks. Additionally, we present results of experiments on mining incomplete data using both characteristic sets and maximal consistent blocks under two interpretations of missing attribute values: lost values and “do not care” conditions. We show that there is some evidence that the best approach is to use middle probabilistic approximations based on characteristic sets or on maximal consistent blocks.
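The abstract does not spell out its definitions, so the following is only a minimal sketch of the characteristic-set side of the approach. It assumes the conventions commonly used in this line of work: a lost value ("?") keeps a case out of every block of that attribute, a "do not care" condition ("*") puts it in all of them, and a concept probabilistic approximation with threshold alpha is the union of those characteristic sets K(x), for x in the concept X, with |X ∩ K(x)| / |K(x)| ≥ alpha. The toy table, the attribute names, and all function names are hypothetical, and maximal consistent blocks are not covered.

```python
from fractions import Fraction

LOST, DONT_CARE = "?", "*"  # assumed markers for the two interpretations


def attribute_value_blocks(table, attr):
    """Map each specified value v of attr to the block [(attr, v)]."""
    values = {row[attr] for row in table.values()} - {LOST, DONT_CARE}
    blocks = {v: set() for v in values}
    for case, row in table.items():
        if row[attr] == DONT_CARE:
            # "do not care": the case belongs to every block of this attribute
            for v in values:
                blocks[v].add(case)
        elif row[attr] != LOST:
            # specified value: the case belongs to exactly one block
            blocks[row[attr]].add(case)
        # lost value: the case belongs to no block of this attribute
    return blocks


def characteristic_set(table, case, attrs):
    """Intersect the blocks of the case's specified values; a missing
    value contributes the whole universe, so it is simply skipped."""
    result = set(table)
    for a in attrs:
        v = table[case][a]
        if v not in (LOST, DONT_CARE):
            result &= attribute_value_blocks(table, a)[v]
    return result


def probabilistic_approximation(table, concept, attrs, alpha):
    """Concept probabilistic approximation with threshold alpha."""
    approx = set()
    for x in concept:
        k = characteristic_set(table, x, attrs)  # always contains x
        if Fraction(len(k & concept), len(k)) >= alpha:
            approx |= k
    return approx


# Hypothetical toy data set with one lost value and one "do not care".
table = {
    1: {"Temp": "high", "Headache": "yes"},
    2: {"Temp": "?", "Headache": "yes"},
    3: {"Temp": "high", "Headache": "*"},
    4: {"Temp": "normal", "Headache": "no"},
}
attrs = ["Temp", "Headache"]
concept = {1, 2}

lower = probabilistic_approximation(table, concept, attrs, 1)
middle = probabilistic_approximation(table, concept, attrs, Fraction(1, 2))
```

Under these assumptions, alpha = 1 gives the lower approximation, a small positive alpha gives the upper approximation, and alpha = 0.5 gives the middle approximation that the abstract singles out.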
Related Topics
Physical Sciences and Engineering
Computer Science
Artificial Intelligence
Authors
Patrick G. Clark, Cheng Gao, Jerzy W. Grzymala-Busse, Teresa Mroczek