Article ID: 10322116
Journal: Expert Systems with Applications
Published Year: 2014
Pages: 16 Pages
File Type: PDF
Abstract
This paper presents a Dynamic Clustering Algorithm for histogram data that includes an automatic variable-weighting step based on adaptive distances. The Dynamic Clustering Algorithm is a k-means-like algorithm that partitions a set of objects into a predefined number of classes. Histogram data are realizations of particular set-valued descriptors defined in the context of Symbolic Data Analysis. We propose using the ℓ2 Wasserstein distance for clustering histogram data, together with two novel adaptive-distance-based clustering schemes. The ℓ2 Wasserstein distance allows the variability of a set of histograms to be expressed as two components: the first related to the variability of their averages, and the second to the variability of the histograms related to their different sizes and shapes. The weighting step takes into account global and local adaptive distances as well as these two components of variability. To evaluate the clustering results, we extend some classic partition quality indexes to the case in which the proposed adaptive distances are used in the clustering criterion function. Examples on synthetic and real-world datasets corroborate the proposed clustering procedure.
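The two-component decomposition mentioned in the abstract can be illustrated numerically. The sketch below is not the authors' implementation; it only uses the standard fact that, for one-dimensional distributions, the squared ℓ2 Wasserstein distance equals the integral of the squared difference between quantile functions, which splits exactly into a squared difference of means plus a distance between the centered distributions. The function names (`histogram_quantiles`, `wasserstein2_decomposition`) are hypothetical, and the code assumes each histogram has uniform density within its bins and strictly positive bin weights.

```python
import numpy as np

def histogram_quantiles(edges, weights, probs):
    """Quantile function of a histogram, evaluated at the given probabilities.

    Assumes uniform density inside each bin, so the CDF is piecewise linear
    and can be inverted by linear interpolation. Bin weights are assumed
    strictly positive (illustrative assumption).
    """
    edges = np.asarray(edges, dtype=float)
    weights = np.asarray(weights, dtype=float)
    cdf = np.concatenate(([0.0], np.cumsum(weights) / weights.sum()))
    return np.interp(probs, cdf, edges)

def wasserstein2_decomposition(edges_a, weights_a, edges_b, weights_b, n_grid=10_000):
    """Approximate the squared l2 Wasserstein distance between two histograms.

    Returns (total, mean_part, shape_part), where total == mean_part + shape_part
    up to floating-point error. The integral over (0, 1) is approximated with a
    midpoint rule on n_grid points.
    """
    probs = (np.arange(n_grid) + 0.5) / n_grid
    qa = histogram_quantiles(edges_a, weights_a, probs)
    qb = histogram_quantiles(edges_b, weights_b, probs)
    total = np.mean((qa - qb) ** 2)
    # Component 1: squared difference between the histogram means.
    mean_part = (qa.mean() - qb.mean()) ** 2
    # Component 2: distance between the centered quantile functions,
    # which captures differences in size (spread) and shape.
    shape_part = np.mean(((qa - qa.mean()) - (qb - qb.mean())) ** 2)
    return total, mean_part, shape_part

# Usage: a uniform histogram on [0, 1] versus a uniform histogram on [2, 4].
total, mean_part, shape_part = wasserstein2_decomposition([0, 1], [1], [2, 4], [1])
print(total, mean_part, shape_part)
```

In this example the two histograms differ both in location (means 0.5 and 3) and in spread, so both components are non-zero. The adaptive distances discussed in the abstract weight such components per variable; that weighting step is not shown here.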
Related Topics
Physical Sciences and Engineering > Computer Science > Artificial Intelligence