Article ID: 484825
Journal: Procedia Computer Science
Published Year: 2015
Pages: 8 Pages
File Type: PDF
Abstract

The emergence of big data has pushed the machine learning research community to develop unsupervised, distributed, and computationally efficient learning algorithms that can exploit such data. Extreme learning machines (ELM) have gained popularity as a neuron-based architecture with fast training and good generalization. In this work, we parallelize the unsupervised ELM (US-ELM) algorithm proposed in the literature on a distributed framework in order to learn clustering models from big data. We propose three approaches: 1) Parallel US-ELM, which simply distributes the data over computing nodes; 2) Hierarchical US-ELM, which clusters the data hierarchically; and 3) Ensemble US-ELM, which is an ensemble of weak ELM models. When tested on multiple datasets from UCI, the algorithms trained faster than their serial counterparts and generalized better than other clustering algorithms in the literature.
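The abstract does not say which computations are distributed, so the following is only a minimal sketch of one way the data-distribution idea behind the first approach could work, not the authors' implementation. It assumes a random sigmoid hidden layer and shows that the hidden-layer Gram matrix H^T H can be accumulated from per-partition pieces, because H^T H equals the sum of H_i^T H_i over row partitions. The graph-based regularization term and the eigen-decomposition step of unsupervised ELM are omitted, and all function names (`hidden_layer`, `gram`, `partition`, `distributed_gram`) are hypothetical.

```python
import math
import random


def hidden_layer(rows, weights, biases):
    """Map each input row to sigmoid activations of a random hidden layer."""
    out = []
    for x in rows:
        acts = []
        for w, b in zip(weights, biases):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            acts.append(1.0 / (1.0 + math.exp(-z)))
        out.append(acts)
    return out


def gram(h):
    """Return H^T H for a matrix H stored as a list of rows."""
    n_hidden = len(h[0])
    g = [[0.0] * n_hidden for _ in range(n_hidden)]
    for row in h:
        for i in range(n_hidden):
            for j in range(n_hidden):
                g[i][j] += row[i] * row[j]
    return g


def add_matrices(a, b):
    """Element-wise sum of two equally sized matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def partition(rows, n_parts):
    """Split rows into contiguous chunks (stand-ins for computing nodes)."""
    size = math.ceil(len(rows) / n_parts)
    return [rows[i:i + size] for i in range(0, len(rows), size)]


def distributed_gram(rows, weights, biases, n_parts):
    """Map step: H_i^T H_i per chunk. Reduce step: sum the partial matrices."""
    total = None
    for chunk in partition(rows, n_parts):
        partial = gram(hidden_layer(chunk, weights, biases))
        total = partial if total is None else add_matrices(total, partial)
    return total


# Synthetic data; sizes and seed are arbitrary illustration values.
rng = random.Random(0)
n_samples, n_features, n_hidden = 40, 3, 5
data = [[rng.uniform(-1, 1) for _ in range(n_features)] for _ in range(n_samples)]
weights = [[rng.uniform(-1, 1) for _ in range(n_features)] for _ in range(n_hidden)]
biases = [rng.uniform(-1, 1) for _ in range(n_hidden)]

serial = gram(hidden_layer(data, weights, biases))
parallel = distributed_gram(data, weights, biases, n_parts=4)
max_diff = max(
    abs(s - p) for rs, rp in zip(serial, parallel) for s, p in zip(rs, rp)
)
```

Here the chunks are processed in a loop for simplicity; on a real cluster each chunk would live on a separate node and only the small `n_hidden` by `n_hidden` partial matrices would need to be communicated. Every node must share the same random `weights` and `biases` for the partial results to be summable.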
