Article ID Journal Published Year Pages File Type
1147911 Journal of Statistical Planning and Inference 2009 6 Pages PDF
Abstract

Let P be a probability distribution (possibly empirical) on the real line ℝ. Consider the problem of finding the k-mean of P, i.e. a set A of at most k points that minimizes a given loss function. It is known that the k-mean can be found using an iterative algorithm by Lloyd [1982. Least squares quantization in PCM. IEEE Transactions on Information Theory 28, 129–136]. However, depending on the complexity of the distribution P, applying this algorithm can be quite resource-consuming. One way to overcome this problem is to group the original data and calculate the k-mean from the grouped data. As a result, the new k-mean will be biased, and our aim is to measure the loss in approximation quality caused by this approach.
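
The setting in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes a squared-error loss (the abstract only says "given loss function"), equal-width bins replaced by their midpoints as the grouping scheme, and an arbitrary initialisation at evenly spaced order statistics. The names `lloyd_1d`, `loss`, and `group` are hypothetical helpers introduced only for this illustration. Note also that a Lloyd-type iteration is in general only guaranteed to reach a local minimum of the loss.

```python
import math
import random


def lloyd_1d(points, weights, k, iters=100):
    """Weighted Lloyd-type iteration on the real line (squared-error loss assumed)."""
    # Assumption: start from evenly spaced order statistics of the data.
    srt = sorted(points)
    n = len(srt)
    centers = [srt[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    for _ in range(iters):
        sums = [0.0] * k
        mass = [0.0] * k
        # Assignment step: each point goes to its nearest center.
        for x, w in zip(points, weights):
            j = min(range(k), key=lambda i: (x - centers[i]) ** 2)
            sums[j] += w * x
            mass[j] += w
        # Update step: each center moves to the weighted mean of its cell.
        new = [sums[j] / mass[j] if mass[j] > 0 else centers[j] for j in range(k)]
        if new == centers:
            break
        centers = new
    return sorted(centers)


def loss(points, weights, centers):
    """Weighted mean squared distance from each point to its nearest center."""
    total = sum(weights)
    return sum(w * min((x - c) ** 2 for c in centers)
               for x, w in zip(points, weights)) / total


def group(points, width):
    """Group data into equal-width bins; return bin midpoints and bin counts."""
    counts = {}
    for x in points:
        b = math.floor(x / width)
        counts[b] = counts.get(b, 0) + 1
    bins = sorted(counts)
    return [(b + 0.5) * width for b in bins], [float(counts[b]) for b in bins]


if __name__ == "__main__":
    rng = random.Random(0)
    data = [rng.gauss(-2, 1) for _ in range(2000)] + [rng.gauss(3, 1) for _ in range(2000)]
    ones = [1.0] * len(data)
    exact = lloyd_1d(data, ones, k=2)        # centers from the original data
    mids, counts = group(data, width=0.5)
    approx = lloyd_1d(mids, counts, k=2)     # centers from the grouped data
    # Both sets of centers are evaluated on the ORIGINAL data.
    print(loss(data, ones, exact), loss(data, ones, approx))
```

Comparing the two printed values gives one rough way to see the loss of approximation quality: the centers obtained from the grouped data are scored against the original data, so any gap between the two numbers is attributable to the grouping under the assumptions above.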
