Article ID: 6857912
Journal: Information Sciences
Published Year: 2014
Pages: 20
File Type: PDF
Abstract
Attribute reduction is a key technique for knowledge acquisition in rough set theory, but performing it on massive data remains challenging. On such data, reduction efficiency depends mainly on how effectively equivalence classes and attribute significance are computed. To address this problem, we propose several parallel attribute reduction algorithms. Specifically, we design a novel 〈key,value〉 pair structure that speeds up the computation of equivalence classes and attribute significance, and we parallelize the traditional attribute reduction process using the MapReduce mechanism. We also compare and analyze the different parallelization strategies for attribute reduction from a theoretical perspective. Extensive experimental results demonstrate that the proposed parallel attribute reduction algorithms perform efficiently and scale well on massive data.
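To make the idea concrete, the following is a minimal single-process sketch of how 〈key,value〉 pairs can drive the computation of equivalence classes and attribute significance in a map/reduce style. It is not the structure or the algorithms proposed in the paper, which the abstract does not spell out. The key layout (condition-attribute values paired with the decision value), the function names `map_split`, `reduce_counts`, `dependency`, and `gamma`, the sample `splits`, and the choice of the positive-region dependency degree as the significance measure are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

def map_split(rows, attrs):
    """Map phase (assumed layout): for one data split, count rows per key
    (values of the chosen condition attributes, decision value)."""
    counts = Counter()
    for *cond, decision in rows:
        key = tuple(cond[a] for a in attrs)
        counts[(key, decision)] += 1
    return counts

def reduce_counts(partials):
    """Reduce phase: sum the per-split counts for identical keys."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total

def dependency(counts, n_rows):
    """Dependency degree |POS_B(D)| / |U|: the fraction of rows whose
    equivalence class under B maps to a single decision value."""
    decisions = defaultdict(set)
    sizes = Counter()
    for (key, decision), c in counts.items():
        decisions[key].add(decision)
        sizes[key] += c
    positive = sum(sizes[k] for k, ds in decisions.items() if len(ds) == 1)
    return positive / n_rows

def gamma(splits, attrs):
    """Run the map and reduce steps over all splits for attribute subset attrs."""
    n_rows = sum(len(s) for s in splits)
    partials = [map_split(s, attrs) for s in splits]  # parallelizable per split
    return dependency(reduce_counts(partials), n_rows)

# Hypothetical toy data: two condition attributes followed by a decision value,
# stored in two splits as if they lived on different workers.
splits = [
    [(0, 0, "y"), (0, 1, "n")],
    [(1, 0, "y"), (0, 0, "y"), (1, 0, "n")],
]

# Significance of attribute 1 relative to the subset {0}, under the assumed
# measure: the gain in dependency degree when attribute 1 is added.
significance = gamma(splits, [0, 1]) - gamma(splits, [0])
```

Because each split is mapped independently and the reduce step only sums counts, the per-split work could be distributed across workers; a greedy reduction loop would repeat this evaluation for each candidate attribute.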
Related Topics
Physical Sciences and Engineering > Computer Science > Artificial Intelligence