Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
536013 | 870429 | 2011 | 8-page PDF | Free download |

In high-dimensional data, clusters of objects usually exist in subspaces; moreover, different clusters probably have different shape volumes. Most existing methods for high-dimensional data clustering, however, consider only the former factor and ignore the latter by assuming the same shape volume for all clusters. In this paper we propose a new Gaussian mixture model (GMM) type algorithm for discovering clusters with various shape volumes in subspaces. We extend the GMM clustering method to calculate a local weight vector as well as a local variance within each cluster, and use the weight and variance values to capture the main properties that discriminate different clusters, including subsets of relevant dimensions and shape volumes. This is achieved by introducing the negative entropy of the weight vectors, along with adaptively chosen coefficients, into the objective function of the extended GMM. Experimental results on both synthetic and real datasets show that the proposed algorithm outperforms its competitors, especially when applied to high-dimensional datasets.
► The algorithm uses a new parameter of shape volume for each cluster.
► The algorithm solves for the model parameters by applying a novel scheme.
► An adaptive method is presented for obtaining the coefficients used in the scheme.
► The algorithm discovers clusters with various shape volumes in subspaces.
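
The abstract does not give the objective function or the update rules, so the following is only a minimal sketch of the general idea of entropy-regularized local dimension weights, not the paper's algorithm. It assumes a per-cluster cost of the form sum_j w_kj * D_kj + gamma * sum_j w_kj * log(w_kj) with the weights of each cluster summing to 1, for which the minimizer is a softmax of -D_kj / gamma. The function name `entropy_weights`, the fixed coefficient `gamma` (the paper chooses its coefficients adaptively), and the toy data are all illustrative assumptions.

```python
import numpy as np

def entropy_weights(X, R, M, gamma):
    """Hypothetical per-cluster dimension weights under a negative-entropy penalty.

    X: (n, d) data, R: (n, k) soft memberships, M: (k, d) cluster centers.
    gamma: assumed fixed regularization coefficient (not the paper's adaptive one).
    """
    # Squared deviation of every point from every center, per dimension: (n, k, d)
    diff2 = (X[:, None, :] - M[None, :, :]) ** 2
    # Membership-weighted dispersion of cluster k along dimension j: (k, d)
    D = np.einsum("ik,ikj->kj", R, diff2)
    # Softmax of -D / gamma, shifted by the row maximum for numerical stability
    logits = -D / gamma
    logits -= logits.max(axis=1, keepdims=True)
    W = np.exp(logits)
    return W / W.sum(axis=1, keepdims=True)

# Toy data: one cluster that is tight in dimension 0 and spread out in dimension 1
rng = np.random.default_rng(0)
X = np.column_stack([rng.normal(0.0, 0.1, 200), rng.normal(0.0, 5.0, 200)])
R = np.ones((200, 1))               # every point fully assigned to the single cluster
M = X.mean(axis=0, keepdims=True)   # cluster center
W = entropy_weights(X, R, M, gamma=50.0)
```

In this sketch the dimension along which the cluster is compact receives nearly all of the weight, which is how a local weight vector can mark the subspace relevant to a cluster. A larger `gamma` pushes the weights toward uniform, and a smaller one concentrates them on fewer dimensions.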
Journal: Pattern Recognition Letters - Volume 32, Issue 8, 1 June 2011, Pages 1154–1161