Article ID: 6861469
Journal: Knowledge-Based Systems
Published Year: 2018
Pages: 9
File Type: PDF
Abstract
We propose a new ensemble method, named Principal Components Analysis Random Discretization Ensemble, that overcomes this drawback using the Apache Spark platform and PCA for dimensionality reduction. Experimental results on five large-scale datasets show that our solution outperforms both the original algorithm and Random Forest in predictive performance. The results also show that high-dimensional data can affect the runtime of the algorithm.