Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
4602796 | 1336938 | 2009 | 10-page PDF | Free download |

Given a dataset D partitioned into clusters, the joint distance function (JDF) J(x) at a point x is the harmonic mean of the distances between x and the cluster centers. The JDF is a continuous function that captures the data points in its lower level sets (a property called contour approximation), and it is a useful concept in probabilistic clustering and data analysis. In particular, contour approximation allows a compact representation of the data: for a dataset in R^n with N points organized in K clusters, the JDF requires K centers and, if Mahalanobis distances are used, K covariance matrices, for a total of Kn(n+3)/2 parameters (n per center plus n(n+1)/2 per symmetric covariance matrix). This is a considerable reduction in storage when N ≫ K, n. The JDF of the whole dataset, J(D) ≔ ∑{J(x) : x ∈ D}, is a measure of the classifiability of the data and can be used to determine the “right” number of clusters for D. A duality theory for the JDF J(D) is given, in analogy with Kuhn’s geometric duality theory for the Fermat–Weber location problem. The JDF J(D) is the optimal value of a primal problem (P), for which a dual problem (D) is given, with a sharp lower bound on J(D).
Journal: Linear Algebra and its Applications - Volume 430, Issue 10, 1 May 2009, Pages 2771-2780
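
The definitions in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`jdf`, `jdf_dataset`, `parameter_count`) are hypothetical, and the JDF is taken here as the standard harmonic mean K / Σ(1/d_k), whereas the paper may normalize by a different constant. Distances are Euclidean unless covariance matrices are supplied, in which case Mahalanobis distances are used.

```python
import numpy as np


def mahalanobis(x, center, cov):
    """Mahalanobis distance sqrt((x - c)^T cov^{-1} (x - c))."""
    diff = np.asarray(x, dtype=float) - np.asarray(center, dtype=float)
    # Solve cov * y = diff instead of forming the inverse explicitly.
    return float(np.sqrt(diff @ np.linalg.solve(cov, diff)))


def jdf(x, centers, covs=None):
    """Harmonic mean of the distances from x to the K cluster centers.

    Assumption: standard harmonic mean K / sum(1/d_k); the paper's
    normalization constant may differ.
    """
    K = len(centers)
    if covs is None:
        dists = [
            float(np.linalg.norm(np.asarray(x, dtype=float) - np.asarray(c, dtype=float)))
            for c in centers
        ]
    else:
        dists = [mahalanobis(x, c, S) for c, S in zip(centers, covs)]
    # The harmonic mean tends to 0 as any distance tends to 0,
    # so the continuous extension at a cluster center is 0.
    if any(d == 0.0 for d in dists):
        return 0.0
    return K / sum(1.0 / d for d in dists)


def jdf_dataset(X, centers, covs=None):
    """J(D): sum of J(x) over all points x in the dataset."""
    return sum(jdf(x, centers, covs) for x in X)


def parameter_count(K, n):
    """K centers (n numbers each) plus K symmetric n-by-n covariances
    (n(n+1)/2 numbers each) gives K*n*(n+3)/2 parameters in total."""
    return K * n * (n + 3) // 2
```

For example, with centers at (0, 0) and (4, 0), the point (1, 0) lies at distances 1 and 3, so `jdf([1, 0], [[0, 0], [4, 0]])` evaluates to 2 / (1 + 1/3) = 1.5. For K = 3 clusters in the plane, `parameter_count(3, 2)` gives 15, regardless of the number of data points N.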