Article ID Journal Published Year Pages File Type
532047 Pattern Recognition 2015 13 Pages PDF
Abstract

•Geodesic based hybrid similarity measures are proposed for approximate spectral clustering.•Neighborhood graph, required for geodesic approach, is effectively constructed by weighted Delaunay triangulation (CONN).•The proposed geodesic based hybrid similarities outperform in terms of both accuracies and cluster validity indices.•The proposed geodesic based hybrid similarities can be powerful for clustering of large remote sensing images.•The proposed similarities are significant especially for quantization based approximate spectral clustering.

Spectral clustering has been popular thanks to its ability to extract clusters of varying characteristics without using a parametric model in expense of high computational cost required for eigendecomposition of pairwise similarities. In order to utilize its advantages in large datasets where it is infeasible due to its computational burden, approximate spectral clustering (ASC) methods apply spectral clustering on a reduced set of points (data representatives) selected by sampling or quantization. This two-step approach (i.e. finding the representatives and then clustering them) brings new opportunities for precise similarity definition such as manifold based topological relations, data distribution within the Voronoi polyhedra of the representatives, and their geodesic distance information, which are often ignored in similarity definition for ASC. In this study, we propose geodesic based hybrid similarity criteria which enable the use of different types of information for accurate similarity representation in ASC. Despite the fact that geodesic concept has been widely used in clustering, our contribution is the unique way of representing data topology to form geodesic relations and jointly harnessing various information types including topology, distance and density. The proposed criteria are tested using both sampling (selective sampling) and quantization (neural gas and k-means++) approaches. Experiments on artificial datasets, well-known small/medium-size real datasets, and four large datasets (four remote-sensing images), with different types of clusters, show that the proposed geodesic based hybrid similarity criteria outperform traditional similarity criteria in terms of clustering accuracies and several cluster validity indices.

Related Topics
Physical Sciences and Engineering Computer Science Computer Vision and Pattern Recognition
Authors
, , ,