Better alternatives to current methods of scaling and weighting data for cluster analysis

Article ID	Journal	Published Year	Pages	File Type
1150595	Journal of Statistical Planning and Inference	2007	14 Pages	PDF

Abstract

Scaling multivariate data is an important preprocessing step before cluster analysis, and several scaling methods are in current use. This paper proposes alternatives aimed specifically at revealing cluster structure in the data. The new methods are applied to simulated and real data sets, and their performance is compared with that of some currently used methods. The results indicate that, in many situations, the new methods perform much better than the most popular method, autoscaling. In the most challenging clustering example considered, their performance, while poor, is no worse than that of any of the currently used methods.
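
For readers unfamiliar with the baseline named in the abstract, autoscaling is commonly understood as centering each variable on its mean and dividing by its standard deviation, so that every variable has unit variance before clustering. The sketch below illustrates only that baseline, not the alternatives the paper proposes (which the abstract does not describe). The function name `autoscale`, the toy data, and the use of the sample standard deviation are assumptions made for illustration.

```python
from statistics import mean, stdev


def autoscale(rows):
    """Center each column on its mean and divide by its sample standard deviation.

    Illustrative helper (hypothetical name); constant columns are mapped to 0.0
    to avoid division by zero.
    """
    columns = list(zip(*rows))
    centers = [mean(col) for col in columns]
    spreads = [stdev(col) for col in columns]
    return [
        [(value - c) / s if s > 0 else 0.0
         for value, c, s in zip(row, centers, spreads)]
        for row in rows
    ]


# Toy data: two variables measured on very different scales.
data = [
    [1.0, 1000.0],
    [2.0, 3000.0],
    [3.0, 2000.0],
    [4.0, 4000.0],
]

scaled = autoscale(data)
for row in scaled:
    print([round(v, 3) for v in row])
```

After autoscaling, both variables contribute on a comparable scale to distance-based clustering. The scaling itself uses no information about group structure, which is the kind of limitation the abstract suggests the proposed methods are meant to address.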

Keywords

Variable scaling; Discriminant analysis; Clustering; Variable weighting