Article ID: 6856401
Journal: Information Sciences
Published Year: 2018
Pages: 14
File Type: PDF
Abstract
Mining incomplete data using approximations based on characteristic sets is a well-established technique. It applies to incomplete data sets under several interpretations of missing attribute values, e.g., lost values and “do not care” conditions. Maximal consistent blocks, in contrast, were introduced for incomplete data sets with only “do not care” conditions, and only with lower and upper approximations. In this paper we extend maximal consistent blocks to incomplete data sets with any interpretation of missing attribute values and to probabilistic approximations. We prove new results on probabilistic approximations based on generalized maximal consistent blocks. Additionally, we present results of experiments on mining incomplete data using both characteristic sets and maximal consistent blocks, under two interpretations of missing attribute values: lost values and “do not care” conditions. The results provide some evidence that the best approach is to use middle probabilistic approximations based on characteristic sets or on maximal consistent blocks.
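The abstract does not spell out its definitions, so the following is only a minimal sketch of how characteristic sets and a concept probabilistic approximation are commonly defined in the rough-set literature. The toy data set, the attribute names, the markers `?` (lost value) and `*` ("do not care" condition), and all function names are hypothetical and chosen for illustration; reading "middle" as the threshold alpha = 1/2 is likewise an assumption. Maximal consistent blocks are not covered.

```python
from fractions import Fraction

LOST = "?"         # assumed marker for a lost value
DO_NOT_CARE = "*"  # assumed marker for a "do not care" condition


def attribute_value_block(data, attr, value):
    """Cases whose value for attr equals `value` or is a "do not care" condition.

    Cases with a lost value for attr are excluded from every block.
    """
    return {x for x, row in data.items() if row[attr] in (value, DO_NOT_CARE)}


def characteristic_set(data, x, attrs):
    """Intersect the blocks of all specified attribute values of case x.

    Attributes where x has a lost or "do not care" value impose no constraint.
    """
    result = set(data)
    for attr in attrs:
        value = data[x][attr]
        if value not in (LOST, DO_NOT_CARE):
            result &= attribute_value_block(data, attr, value)
    return result


def concept_probabilistic_approximation(data, concept, attrs, alpha):
    """Union of K(x), x in the concept, with Pr(concept | K(x)) >= alpha."""
    approx = set()
    for x in concept:
        k = characteristic_set(data, x, attrs)  # never empty: x is in K(x)
        if Fraction(len(k & concept), len(k)) >= alpha:
            approx |= k
    return approx


# Hypothetical toy data set: five cases, two attributes.
data = {
    1: {"temp": "high", "headache": "yes"},
    2: {"temp": LOST, "headache": "yes"},
    3: {"temp": DO_NOT_CARE, "headache": "no"},
    4: {"temp": "normal", "headache": "no"},
    5: {"temp": "high", "headache": DO_NOT_CARE},
}
attrs = ["temp", "headache"]
concept = {1, 2, 3}  # hypothetical concept, e.g. cases labelled flu = yes

lower = concept_probabilistic_approximation(data, concept, attrs, 1)
middle = concept_probabilistic_approximation(data, concept, attrs, Fraction(1, 2))
upper = concept_probabilistic_approximation(data, concept, attrs, Fraction(1, 100))
print(lower, middle, upper)
```

With alpha = 1 the sketch reduces to a lower approximation, and with a small positive alpha to an upper approximation; intermediate thresholds give the approximations in between.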
Related Topics
Physical Sciences and Engineering > Computer Science > Artificial Intelligence