Article ID: 1152310
Journal: Statistics & Probability Letters
Published Year: 2011
Pages: 11 Pages
File Type: PDF
Abstract
Clustering is the problem of partitioning data into a finite number k of homogeneous and separate groups, called clusters. A good choice of k is essential for building meaningful clusters. In this paper, this task is addressed from the point of view of model selection via penalization. We design an appropriate penalty shape and derive an associated oracle-type inequality. The method is illustrated on both simulated and real-life data sets.
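As a rough illustration of model selection via penalization for the number of clusters, the sketch below picks the k that minimizes the empirical distortion plus a penalty that grows with k. The penalty shape `c * sqrt(k / n)`, the constant `c`, the helper names `kmeans_1d` and `select_k`, and the toy data are all assumptions made for this example; the abstract does not state the penalty shape derived in the paper.

```python
import math

def kmeans_1d(points, k, iters=50):
    """Lloyd's algorithm on 1-D data; returns the mean squared distortion."""
    pts = sorted(points)
    n = len(pts)
    # Deterministic initialization: evenly spaced quantiles of the sorted data.
    centers = [pts[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for x in pts:
            nearest = min(range(k), key=lambda i: (x - centers[i]) ** 2)
            groups[nearest].append(x)
        # Keep the previous center if a group ends up empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return sum(min((x - c) ** 2 for c in centers) for x in pts) / n

def select_k(points, k_max, c=1.0):
    """Return the k minimizing distortion + penalty, and all criterion values.

    The penalty c * sqrt(k / n) is a hypothetical shape chosen for illustration.
    """
    n = len(points)
    crit = {
        k: kmeans_1d(points, k) + c * math.sqrt(k / n)
        for k in range(1, k_max + 1)
    }
    return min(crit, key=crit.get), crit

# Toy data: three tight, well-separated groups around 0, 10 and 20.
points = [center + 0.01 * i for center in (0.0, 10.0, 20.0) for i in range(-10, 11)]
best_k, criterion = select_k(points, k_max=5)
print(best_k, criterion)
```

Without the penalty term, the distortion alone can only decrease as k grows, so it would always favor the largest k; the penalty is what makes the trade-off meaningful. In practice the constant `c` has to be calibrated to the scale of the data.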
Related Topics
Physical Sciences and Engineering > Mathematics > Statistics and Probability