Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
6861469 | Knowledge-Based Systems | 2018 | 9 Pages |
Abstract
We propose a new ensemble method to overcome this drawback using the Apache Spark platform and PCA for dimension reduction, named Principal Components Analysis Random Discretization Ensemble. Experimental results on five large-scale datasets show that our solution outperforms both the original algorithm and Random Forest in terms of prediction performance. Results also show that high-dimensional data can affect the runtime of the algorithm.
Related Topics
Physical Sciences and Engineering
Computer Science
Artificial Intelligence
Authors
Diego García-Gil, Sergio Ramírez-Gallego, Salvador García, Francisco Herrera