Competitive clustering algorithms based on ultrametric properties

Article ID	Journal	Published Year	Pages	File Type
429412	Journal of Computational Science	2013	13 Pages	PDF

Abstract

We propose two new competitive unsupervised clustering algorithms. The first deals with ultrametric data and has a computational cost of O(n). The second has two strong features: it is fast, and it is flexible with respect to both the type of data processed and the precision required. Its computational cost is O(n²) in the worst case and O(n) in the average case. These complexities result from exploiting the properties of ultrametric distances. In the first method, we use the order induced by an ultrametric on a given space to show how data proximity can be explored quickly. In the second method, we build an ultrametric space from a data sample chosen uniformly at random, in order to obtain a global view of the proximities in the data set according to the similarity criterion. We then use this proximity profile to cluster the whole data set. We present an example of our algorithms and compare their results with those of a classic clustering method.
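The abstract does not give the details of either algorithm, so the following Python sketch is only a rough illustration of the two ideas it names, not the authors' method. It assumes that an ultrametric is a distance satisfying d(x, z) <= max(d(x, y), d(y, z)), and it swaps in a standard construction (the minimax-path, or single-linkage, ultrametric) to build an ultrametric on a uniformly random sample. All function names, the `threshold` parameter, and the nearest-sample assignment rule are hypothetical choices made for this illustration.

```python
import random


def is_ultrametric(d, tol=1e-12):
    """Check the strong triangle inequality d[i][k] <= max(d[i][j], d[j][k])."""
    n = len(d)
    return all(
        d[i][k] <= max(d[i][j], d[j][k]) + tol
        for i in range(n) for j in range(n) for k in range(n)
    )


def subdominant_ultrametric(d):
    """Minimax-path distance (the single-linkage ultrametric) derived from d.

    Assumption: this standard construction stands in for whatever
    ultrametric the paper builds on its sample.
    """
    n = len(d)
    u = [row[:] for row in d]
    # Floyd-Warshall with (min, max) in place of (min, +).
    for m in range(n):
        for i in range(n):
            for j in range(n):
                u[i][j] = min(u[i][j], max(u[i][m], u[m][j]))
    return u


def cluster_with_sample(points, dist, sample_size, threshold, seed=0):
    """Hypothetical sample-based clustering sketch.

    1. Draw a sample uniformly at random.
    2. Build an ultrametric on the sample.
    3. Group sample points whose ultrametric distance is <= threshold.
    4. Give every point the label of its nearest sample point.
    """
    rng = random.Random(seed)
    sample = rng.sample(points, sample_size)
    d = [[dist(a, b) for b in sample] for a in sample]
    u = subdominant_ultrametric(d)

    # In an ultrametric space "u <= threshold" is an equivalence relation,
    # so the balls of radius threshold partition the sample.
    labels = [-1] * sample_size
    next_label = 0
    for i in range(sample_size):
        if labels[i] == -1:
            for j in range(sample_size):
                if u[i][j] <= threshold:
                    labels[j] = next_label
            next_label += 1

    # One pass over the data: sample_size distance evaluations per point.
    return [
        labels[min(range(sample_size), key=lambda j: dist(p, sample[j]))]
        for p in points
    ]


points = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
result = cluster_with_sample(points, lambda a, b: abs(a - b),
                             sample_size=4, threshold=0.5)
print(result)
```

In this sketch, with the sample size s held fixed, the assignment pass costs n·s distance evaluations and is therefore linear in n, while building the ultrametric on the sample costs O(s³) independently of n. Whether this matches the cost analysis in the paper is not something the abstract allows us to confirm.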

► We propose two novel competitive unsupervised clustering algorithms.
► They have optimal computation times.
► The first algorithm handles ultrametric data and has a complexity of O(n).
► The second algorithm handles all kinds of data and has two strong features.
► These are flexibility with respect to the studied context and a computational cost that is linear in the average case.

Keywords

Ultrametric; Average analysis; Amortized analysis; Clustering; Complexity