Article ID: 426187
Journal: Future Generation Computer Systems
Published Year: 2011
Pages: 14 Pages
File Type: PDF
Abstract

This paper proposes three novel parallel clustering algorithms based on Kohonen's Self-Organizing Map (SOM). They aim to preserve the topology of the original dataset, both for a meaningful visualization of the results and for discovering associations between features of the dataset through topological operations over the clusters. In all three algorithms the data to be clustered are subdivided among the nodes of a GRID. In the first two algorithms each node executes an on-line SOM, whereas in the third the nodes execute a quasi-batch SOM called MANTRA. The algorithms differ in how the weights computed by the slave nodes are recombined by a master to launch the next epoch of the SOM on the nodes. A proof outline demonstrates the convergence of the proposed parallel SOMs and indicates how to select the learning rate so as to outperform both the sequential SOM and the parallel SOMs available in the literature. A bioinformatics case study illustrates that the parallel SOM can obtain meaningful clusters in massive data mining applications in a fraction of the time needed by the sequential SOM, and that the resulting classification supports fruitful knowledge extraction from massive datasets.
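
The master/slave scheme described above can be sketched as follows. This is only an illustration under stated assumptions, not the paper's method: the abstract does not give the three recombination rules, so the master step here is assumed to be a plain element-wise mean of the slave weights. The map is one-dimensional with a bubble neighbourhood, the slaves are simulated sequentially in one process, and all names (`bmu`, `online_som_epoch`, `recombine`, `parallel_som`) are hypothetical.

```python
import random


def bmu(weights, x):
    """Return the index of the best-matching unit (closest weight vector to x)."""
    return min(
        range(len(weights)),
        key=lambda j: sum((w - xi) ** 2 for w, xi in zip(weights[j], x)),
    )


def online_som_epoch(weights, data, lr, radius):
    """Run one on-line SOM epoch on a 1-D map and return the updated copy."""
    w = [row[:] for row in weights]
    for x in data:
        c = bmu(w, x)
        for j in range(len(w)):
            if abs(j - c) <= radius:  # bubble neighbourhood around the winner
                w[j] = [wj + lr * (xi - wj) for wj, xi in zip(w[j], x)]
    return w


def recombine(slave_weights):
    """Master step (ASSUMPTION): element-wise mean of the slaves' weights."""
    n = len(slave_weights)
    units = len(slave_weights[0])
    dim = len(slave_weights[0][0])
    return [
        [sum(sw[j][d] for sw in slave_weights) / n for d in range(dim)]
        for j in range(units)
    ]


def parallel_som(data, n_units, n_slaves, epochs, lr=0.5, radius=1, seed=0):
    """Simulate the data-parallel loop: split, train locally, recombine, repeat."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = [[rng.random() for _ in range(dim)] for _ in range(n_units)]
    # Each simulated slave node receives a disjoint share of the dataset.
    chunks = [data[i::n_slaves] for i in range(n_slaves)]
    for epoch in range(epochs):
        rate = lr * (1 - epoch / epochs)  # linearly decaying learning rate
        local = [online_som_epoch(weights, chunk, rate, radius) for chunk in chunks]
        weights = recombine(local)  # master launches the next epoch from this
    return weights
```

For example, `parallel_som(data, n_units=4, n_slaves=2, epochs=5)` trains a four-unit map on a list of equal-length feature vectors. On a real GRID the list comprehension over `chunks` would be replaced by remote execution on the slave nodes.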

► A novel parallel clustering method is proposed for accelerating the Self-Organizing Map (SOM) and preserving the topology of the original dataset.
► A proof outline demonstrates the convergence of the proposed parallel SOM.
► Heuristics are provided to maximize the speed-up and to minimize the topological error.
► Simulations show that the parallel SOM outperforms the currently available methods for parallelizing the SOM.
► A case study in bioinformatics illustrates how the method assists in discovering associations between data features from the topological properties of the clusters.

Related Topics
Physical Sciences and Engineering > Computer Science > Computational Theory and Mathematics