Article ID: 4943334
Journal: Expert Systems with Applications
Published Year: 2017
Pages: 29
File Type: PDF
Abstract
This paper presents two versions of a new internal index for clustering validation based on graphs that capture the structural characteristics of each cluster. As a result, the new index overcomes the limitations of traditional indices based on statistical measurements and is effective on clusters of different shapes and sizes. The graphs are generated through an iterative process based on principal component analysis, which partitions the clusters into a configurable number of “sub-clusters”. A minimum spanning tree is then built on the centroids of these sub-clusters and used to estimate both the quality of the clusters and the distances between them. In particular, the quality of a cluster is defined as the level of “cohesion” among its sub-clusters. The two versions of the proposed index differ in how this level of “cohesion” is measured. Finally, the two versions are compared with a selected group of well-known internal indices. In these tests, both versions show a superior capacity to deal with datasets that present different configurations of variances, densities, geometries and levels of noise.
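The pipeline described in the abstract (PCA-based partitioning into sub-clusters, then a minimum spanning tree over their centroids) can be illustrated with a minimal sketch. This is not the authors' algorithm. The abstract does not specify the splitting rule, which group is split next, or either of the two "cohesion" measures, so the sketch makes three assumptions: it splits at the mean of the projection onto the first principal component, always splits the group with the largest scatter, and simply returns the MST edge lengths. The function names and the `n_sub` parameter are hypothetical.

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree
from scipy.spatial.distance import pdist, squareform


def split_by_principal_axis(points):
    """Split points in two along their first principal component.

    Assumption: the cut is placed at the mean projection; the paper's
    actual splitting rule is not given in the abstract.
    """
    centered = points - points.mean(axis=0)
    # The first right-singular vector is the direction of largest variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    projection = centered @ vt[0]
    mask = projection <= projection.mean()
    return points[mask], points[~mask]


def sub_cluster_centroids(points, n_sub=4):
    """Iteratively split one cluster into up to n_sub sub-clusters."""
    groups = [np.asarray(points, dtype=float)]
    while len(groups) < n_sub:
        # Assumed heuristic: split the group with the largest total scatter.
        scatter = [((g - g.mean(axis=0)) ** 2).sum() for g in groups]
        idx = int(np.argmax(scatter))
        if scatter[idx] == 0:
            break  # every remaining group is a single repeated point
        left, right = split_by_principal_axis(groups.pop(idx))
        groups += [left, right]
    return np.array([g.mean(axis=0) for g in groups])


def centroid_mst_edges(centroids):
    """Return the edge lengths of the MST built on the centroids.

    Note: SciPy treats zero entries as missing edges, so coincident
    centroids would not be connected in this sketch.
    """
    dist = squareform(pdist(centroids))
    mst = minimum_spanning_tree(dist).toarray()
    return mst[mst > 0]


# Usage: four collinear points split into two sub-clusters.
cluster = np.array([[0.0, 0.0], [1.0, 0.0], [10.0, 0.0], [11.0, 0.0]])
centroids = sub_cluster_centroids(cluster, n_sub=2)
edges = centroid_mst_edges(centroids)
```

Under these assumptions, short and evenly sized MST edges would suggest tightly connected sub-clusters, while one long edge would suggest a gap inside the cluster. How the two versions of the proposed index turn such a tree into a "cohesion" value is defined in the paper itself, not here.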
Related Topics
Physical Sciences and Engineering > Computer Science > Artificial Intelligence