Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
489589 | Procedia Computer Science | 2015 | 9 Pages | |
Abstract
We have been developing getRNIA, a software tool for data mining under uncertain information. The tool is powered by the NIS-Apriori algorithm, a variation of the well-known Apriori algorithm. This paper considers the parallelization of the NIS-Apriori algorithm and implements part of the algorithm in the Apache Spark environment. We apply the implemented software to two data sets, the Mammographic data set and the Mushroom data set, to examine the effect of the parallelization. Although the parallelization was not very effective for the Mammographic data set, it was much more effective for the Mushroom data set.
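To illustrate why Apriori-style mining lends itself to parallelization, the following is a minimal sketch of the classic (deterministic) Apriori algorithm in which support counting is split across data partitions and then merged. It is not the NIS-Apriori algorithm and does not use Apache Spark; the function names `count_partition` and `frequent_itemsets`, the toy transactions, and the in-process "partitions" are assumptions made for illustration only. In a cluster environment, the per-partition counting step is the part that would presumably run in parallel.

```python
from collections import Counter
from functools import reduce
from itertools import combinations


def count_partition(transactions, candidates):
    """Map step: count, within one partition, the transactions containing each candidate."""
    counts = Counter()
    for t in transactions:
        for c in candidates:
            if c <= t:  # candidate itemset is a subset of the transaction
                counts[c] += 1
    return counts


def frequent_itemsets(partitions, min_support):
    """Level-wise Apriori over partitioned data (hypothetical helper, plain Python)."""
    total = sum(len(p) for p in partitions)
    items = sorted({i for p in partitions for t in p for i in t})
    candidates = [frozenset([i]) for i in items]
    result = {}
    k = 1
    while candidates:
        # Each partition is counted independently; this loop is the parallelizable part.
        partial = [count_partition(p, candidates) for p in partitions]
        # Reduce step: merge the per-partition counts.
        merged = reduce(lambda a, b: a + b, partial, Counter())
        frequent = {c: n / total for c, n in merged.items() if n / total >= min_support}
        result.update(frequent)
        k += 1
        # Join step: build size-k candidates from pairs of frequent (k-1)-itemsets.
        joined = {a | b for a, b in combinations(list(frequent), 2) if len(a | b) == k}
        # Prune step: keep a candidate only if all of its (k-1)-subsets are frequent.
        candidates = [
            c for c in joined
            if all(frozenset(s) in frequent for s in combinations(c, k - 1))
        ]
    return result


# Toy data, assumed for illustration: four transactions in two partitions.
partitions = [
    [frozenset("ab"), frozenset("ac")],
    [frozenset("abc"), frozenset("b")],
]
supports = frequent_itemsets(partitions, min_support=0.5)
```

Here `supports` maps each frequent itemset to its relative support. Because every partition is scanned independently and only the small count tables are merged, the benefit of splitting the work depends on how much counting each level requires, which is one plausible reason a speed-up could differ between data sets.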
Related Topics
Physical Sciences and Engineering
Computer Science
Computer Science (General)