Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
1152310 | Statistics & Probability Letters | 2011 | 11 Pages | |
Abstract
Clustering is the problem of partitioning data into a finite number k of homogeneous and separate groups, called clusters. A good choice of k is essential for building meaningful clusters. In this paper, this task is addressed from the point of view of model selection via penalization. We design an appropriate penalty shape and derive an associated oracle-type inequality. The method is illustrated on both simulated and real-life data sets.
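As a rough illustration of choosing k by penalization, the sketch below picks the k that minimizes an empirical clustering distortion plus a penalty that grows with k. The abstract does not state the penalty shape, so the form a * sqrt(k / n), the constant `a`, the 1-D k-means routine, and the helper names `kmeans_1d` and `select_k` are all assumptions made for this example, not the paper's method.

```python
import math


def kmeans_1d(points, k, iters=50):
    """Lloyd's algorithm on 1-D data; returns (centers, mean squared distortion)."""
    xs = sorted(points)
    n = len(xs)
    # Deterministic initialization: spread the centers over the empirical quantiles.
    centers = [xs[int((2 * j + 1) * n / (2 * k))] for j in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for x in xs:
            nearest = min(range(k), key=lambda c: (x - centers[c]) ** 2)
            groups[nearest].append(x)
        # An empty group keeps its previous center.
        centers = [sum(g) / len(g) if g else centers[j] for j, g in enumerate(groups)]
    distortion = sum(min((x - c) ** 2 for c in centers) for x in xs) / n
    return centers, distortion


def select_k(points, k_max, a):
    """Return the k in 1..k_max minimizing distortion(k) + a * sqrt(k / n).

    The penalty shape and the constant `a` are illustrative assumptions.
    """
    n = len(points)
    crit = {}
    for k in range(1, k_max + 1):
        _, distortion = kmeans_1d(points, k)
        crit[k] = distortion + a * math.sqrt(k / n)
    best_k = min(crit, key=crit.get)
    return best_k, crit


# Toy data: three well-separated groups of five points each.
offsets = [-0.2, -0.1, 0.0, 0.1, 0.2]
data = [center + o for center in (0.0, 5.0, 10.0) for o in offsets]

best_k, crit = select_k(data, k_max=6, a=1.0)
print(best_k)
```

On this toy data the criterion is minimized at k = 3: the distortion drops sharply up to three clusters, and beyond that the penalty outweighs the small remaining gain. In practice the constant `a` would have to be calibrated from the data rather than fixed by hand.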
Related Topics
Physical Sciences and Engineering
Mathematics
Statistics and Probability
Authors
Aurélie Fischer