Article ID: 489589
Journal: Procedia Computer Science
Published Year: 2015
Pages: 9 Pages
File Type: PDF
Abstract

We have been developing getRNIA, a software tool for data mining under uncertain information. The tool is powered by the NIS-Apriori algorithm, a variation of the well-known Apriori algorithm. This paper considers the parallelization of NIS-Apriori and implements part of the algorithm in the Apache Spark environment. We apply the implemented software to two data sets, the Mammographic data set and the Mushroom data set, to examine how the parallelization behaves. The parallelization was not very effective for the Mammographic data set, but it was much more effective for the Mushroom data set.
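
The abstract does not say which part of the algorithm was parallelized, so the following is only a minimal sketch of one common way to parallelize an Apriori-style pass: split the rows into partitions, count candidate itemsets in each partition independently, then sum the partial counts. It shows ordinary Apriori on complete data, not NIS-Apriori's treatment of uncertain information, and it uses Python threads instead of Apache Spark. All names (`count_partition`, `parallel_support`, `apriori`, `PARTITIONS`) and the toy rows are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations

# Hypothetical toy data: two partitions of rows, each row a set of items.
PARTITIONS = [
    [frozenset({"a", "b", "c"}), frozenset({"a", "b"})],
    [frozenset({"a", "c"}), frozenset({"b", "c"}), frozenset({"a", "b", "c"})],
]


def count_partition(rows, candidates):
    """Map step: count, within one partition, the rows containing each candidate."""
    counts = Counter()
    for row in rows:
        for cand in candidates:
            if cand <= row:  # candidate itemset is a subset of the row
                counts[cand] += 1
    return counts


def parallel_support(partitions, candidates, workers=2):
    """Reduce step: sum the per-partition counts into global support counts."""
    total = Counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(
            count_partition, partitions, [candidates] * len(partitions)
        )
        for partial in partials:
            total.update(partial)  # Counter.update adds counts
    return total


def apriori(partitions, min_count, workers=2):
    """Level-wise Apriori; only the support counting is distributed."""
    items = sorted({item for part in partitions for row in part for item in row})
    candidates = [frozenset([item]) for item in items]
    frequent = {}
    k = 1
    while candidates:
        counts = parallel_support(partitions, candidates, workers)
        level = {c: n for c, n in counts.items() if n >= min_count}
        frequent.update(level)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-itemset candidates.
        joined = {a | b for a, b in combinations(list(level), 2) if len(a | b) == k}
        # Prune step: keep a candidate only if every (k-1)-subset is frequent.
        candidates = [
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        ]
    return frequent
```

Calling `apriori(PARTITIONS, min_count=3)` returns a dictionary that maps each frequent itemset to its support count. Threads in CPython give little real speedup for this kind of work; the sketch is meant only to show the partition-then-merge structure, which a cluster framework would distribute across workers.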

Related Topics
Physical Sciences and Engineering > Computer Science > Computer Science (General)