Article ID: 530408
Journal: Pattern Recognition
Published Year: 2014
Pages: 12
File Type: PDF
Abstract

• Propose a new method of estimating information-theoretic measures using k-nearest neighbors (k-nn).
• Introduce a hierarchical clustering routine based on this estimate.
• Use two different values of k, depending on which information-theoretic measure is being estimated.
• Avoid having to tune a critical parameter for each clustering task.
• Handle datasets with clusters of different scales well compared to traditional methods.

We develop a new non-parametric information-theoretic clustering algorithm based on implicit estimation of cluster densities using the k-nearest neighbors (k-nn) approach. Compared to a kernel-based procedure, our hierarchical k-nn approach is robust to parameter choices and can detect clusters of vastly different scales. Two design choices are of particular importance. First, we use two different values of k, depending on whether within-cluster entropy or across-cluster cross-entropy is being evaluated. Second, we use an ensemble clustering approach in which different clustering solutions vote to determine the final clustering. We conduct clustering experiments and report promising results.
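
The abstract does not spell out the estimator, so the following is only a minimal sketch of how k-nn distances can yield entropy and cross-entropy estimates. It assumes the standard Kozachenko-Leonenko form, which may differ from the authors' exact formulation. The function names (`knn_entropy`, `knn_cross_entropy`) and the specific values of k are hypothetical and chosen purely for illustration.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329


def digamma_int(n):
    """Digamma at a positive integer: psi(n) = -gamma + sum_{j=1}^{n-1} 1/j."""
    return -EULER_GAMMA + sum(1.0 / j for j in range(1, n))


def log_unit_ball_volume(d):
    """Log-volume of the d-dimensional Euclidean unit ball."""
    return (d / 2.0) * math.log(math.pi) - math.lgamma(d / 2.0 + 1.0)


def kth_neighbor_distance(x, points, k, skip_self=False):
    """Euclidean distance from x to its k-th nearest neighbor in `points`."""
    dists = sorted(math.dist(x, p) for p in points)
    # If x is itself a member of `points`, index 0 is the zero self-distance.
    return dists[k] if skip_self else dists[k - 1]


def knn_entropy(points, k):
    """Kozachenko-Leonenko style estimate of within-cluster entropy (nats).

    Assumes all points are distinct (a zero distance would break the log).
    """
    n, d = len(points), len(points[0])
    log_r = sum(
        math.log(kth_neighbor_distance(x, points, k, skip_self=True))
        for x in points
    )
    return digamma_int(n) - digamma_int(k) + log_unit_ball_volume(d) + d * log_r / n


def knn_cross_entropy(a, b, k):
    """k-nn estimate of the cross-entropy of cluster `a` relative to cluster `b`.

    Neighbors of each point of `a` are searched in `b` only.
    """
    n, m, d = len(a), len(b), len(a[0])
    log_r = sum(math.log(kth_neighbor_distance(x, b, k)) for x in a)
    return math.log(m) - digamma_int(k) + log_unit_ball_volume(d) + d * log_r / n


if __name__ == "__main__":
    rng = random.Random(0)
    cluster_a = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(200)]
    cluster_b = [(rng.gauss(10.0, 0.01), rng.gauss(10.0, 0.01)) for _ in range(200)]
    # Illustrative only: a different k for entropy than for cross-entropy.
    print("H(a)    =", knn_entropy(cluster_a, k=4))
    print("H(b)    =", knn_entropy(cluster_b, k=4))
    print("H(a; b) =", knn_cross_entropy(cluster_a, cluster_b, k=1))
```

Both estimates depend on the data only through logarithms of neighbor distances, so rescaling a cluster by a factor c shifts its estimated entropy by d·log c without any bandwidth to retune. This is one plausible reading of the scale robustness claimed in the abstract, not a statement of the authors' argument.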

Related Topics
Physical Sciences and Engineering > Computer Science > Computer Vision and Pattern Recognition