Article ID: 417379
Journal: Computational Statistics & Data Analysis
Published Year: 2006
Pages: 16 Pages
File Type: PDF
Abstract

Distributional and asymptotic results on the moment of Rand's C_k statistic were derived by DuBien and Warde [1981. Some distributional results concerning a comparative statistic used in cluster analysis. ASA Proceedings of the Social Statistics Section, 309–313.]. Based on those results, a method for predicting the number of clusters is proposed that applies various agglomerative clustering algorithms. Within this procedure, methods using different indexes are examined and compared based on the concept of agreement (or disagreement) between the clusterings that different algorithms generate on the same set of data. Our method, which has practical generality, works better than the other methods and assigns statistical meaning to the C_k values used to determine the number of clusters from the comparison.
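
As a rough illustration of the agreement idea, the sketch below computes a pairwise agreement score between two clusterings of the same data. It assumes that Rand's statistic is the usual Rand index, that is, the fraction of object pairs that both clusterings treat the same way (together in both, or apart in both). The abstract does not give the formula, so this is an assumption, and the function name `rand_index` is hypothetical. It is not the authors' procedure for choosing the number of clusters.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of object pairs on which two clusterings agree.

    Assumption: this is the standard Rand index; the abstract itself
    does not state the formula for the statistic.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both clusterings must label the same objects")
    n = len(labels_a)
    if n < 2:
        raise ValueError("at least two objects are needed to form a pair")

    agreements = 0
    pairs = 0
    for i, j in combinations(range(n), 2):
        together_a = labels_a[i] == labels_a[j]
        together_b = labels_b[i] == labels_b[j]
        # A pair counts as an agreement when both clusterings place the
        # two objects together, or both place them apart.
        if together_a == together_b:
            agreements += 1
        pairs += 1
    return agreements / pairs


# Hypothetical labels from two clustering algorithms, each cut at k = 2.
clustering_one = [0, 0, 1, 1]
clustering_two = [1, 1, 0, 0]  # same partition, different label names
print(rand_index(clustering_one, clustering_two))
```

Under this reading, one could cut the trees produced by two agglomerative algorithms at each candidate k and compare the resulting scores across k. How those values are then turned into a prediction is the subject of the paper and is not reproduced here.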

Related Topics
Physical Sciences and Engineering > Computer Science > Computational Theory and Mathematics