Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
417379 | Computational Statistics & Data Analysis | 2006 | 16 Pages |
Distributional and asymptotic results on the moments of Rand's $C_k$ statistic were derived by DuBien and Warde [1981. Some distributional results concerning a comparative statistic used in cluster analysis. ASA Proceedings of the Social Statistics Section, 309–313.]. Based on those results, we suggest a method for predicting the number of clusters by applying several agglomerative clustering algorithms to the same data set. Within this procedure, methods that use different indexes are examined and compared in terms of the agreement (or disagreement) between the clusterings that the different algorithms generate. Our method, which has practical generality, works better than the other methods considered, and it assigns statistical meaning to the $C_k$ values used in the comparison to determine the number of clusters.