Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
530408 | Pattern Recognition | 2014 | 12 Pages |
• Propose a new method of estimating Information Theoretic measures using KNN.
• Introduce a hierarchical clustering routine using this estimate.
• Use two different values for k depending on which information theoretic measure is being estimated.
• Avoid having to tune a critical parameter for each clustering task.
• Handles datasets of different scales well compared to traditional methods.
We develop a new non-parametric information-theoretic clustering algorithm based on implicit estimation of cluster densities using the k-nearest neighbors (k-nn) approach. Compared to a kernel-based procedure, our hierarchical k-nn approach is robust to parameter choices and can detect clusters of vastly different scales. Two design choices are of particular importance: the use of two different values of k, depending on whether within-cluster entropy or across-cluster cross-entropy is being evaluated, and an ensemble clustering approach in which different clustering solutions vote to obtain the final clustering. We conduct clustering experiments and report promising results.
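
The abstract does not state the estimator itself, so the following is only a minimal sketch of how k-nn distances can yield entropy and cross-entropy estimates. It assumes a Kozachenko-Leonenko-style estimator, which may differ from the form used in the paper. The function names, the toy data, and the values `k_within` and `k_across` are hypothetical and chosen purely for illustration.

```python
import math
import random


def digamma_int(n):
    """Digamma at a positive integer: psi(n) = -gamma + sum_{i=1}^{n-1} 1/i."""
    euler_gamma = 0.5772156649015329
    return -euler_gamma + sum(1.0 / i for i in range(1, n))


def log_unit_ball_volume(d):
    """Log-volume of the d-dimensional Euclidean unit ball."""
    return 0.5 * d * math.log(math.pi) - math.lgamma(0.5 * d + 1.0)


def kth_nn_distance(x, points, k, skip_self=False):
    """Distance from x to its k-th nearest neighbor in points (brute force)."""
    dists = sorted(math.dist(x, p) for p in points)
    # If x is a member of points, dists[0] is its zero self-distance.
    return dists[k] if skip_self else dists[k - 1]


def knn_entropy(X, k):
    """Assumed Kozachenko-Leonenko-style entropy estimate (in nats) for sample X."""
    n, d = len(X), len(X[0])
    mean_log_r = sum(
        math.log(kth_nn_distance(x, X, k, skip_self=True)) for x in X
    ) / n
    return math.log(n - 1) - digamma_int(k) + log_unit_ball_volume(d) + d * mean_log_r


def knn_cross_entropy(X, Y, k):
    """Assumed k-nn cross-entropy estimate: points of X scored against sample Y."""
    m, d = len(Y), len(X[0])
    mean_log_r = sum(math.log(kth_nn_distance(x, Y, k)) for x in X) / len(X)
    return math.log(m) - digamma_int(k) + log_unit_ball_volume(d) + d * mean_log_r


# Toy data: two 2-D Gaussian clusters of very different scale.
random.seed(0)
wide = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(300)]
tight = [(random.gauss(10, 0.1), random.gauss(10, 0.1)) for _ in range(300)]

# Hypothetical choices: one k for within-cluster entropy, another for cross-entropy.
k_within, k_across = 5, 1

print("entropy(wide)  =", knn_entropy(wide, k_within))
print("entropy(tight) =", knn_entropy(tight, k_within))
print("cross-entropy(wide, tight) =", knn_cross_entropy(wide, tight, k_across))
```

Because the estimate depends only on neighbor distances measured inside each sample, no global kernel bandwidth has to be chosen, which is one plausible reading of why a k-nn approach copes with clusters of different scales. The brute-force neighbor search here is quadratic in the sample size and is meant only for small illustrations.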