Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
536291 | 870492 | 2006 | 14-page PDF | Free download |
Prototype selection based on conventional clustering algorithms yields a good representation but is extremely time-consuming on large data sets. kd-trees, on the other hand, are exceptionally efficient in time and space for large data sets, but fail to produce a reasonable representation in certain situations. We propose a new algorithm, with speed comparable to existing kd-tree-based algorithms, that overcomes the representation problems at high condensation ratios. It uses the Maxdiff criterion to separate distant clusters in the initial stages, before splitting them any further, thus improving the representation. Because the splits are axis-parallel, more nodes are required to represent a data set that has no regions where the points are well separated.
Journal: Pattern Recognition Letters - Volume 27, Issue 3, February 2006, Pages 187–200
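
The abstract does not spell out the algorithm, so the following is only a rough sketch of the general idea, not the authors' method. It assumes that the Maxdiff criterion means cutting at the widest gap between consecutive sorted coordinate values along any axis, that the leaf with the widest gap is split first, and that each leaf is represented by its centroid. The names `maxdiff_split` and `select_prototypes` are hypothetical.

```python
def maxdiff_split(points):
    """Find the widest gap between consecutive sorted coordinates.

    Returns (axis, cut, gap), where points with coordinate <= cut go left,
    or None if all points coincide. This reading of "Maxdiff" is an assumption.
    """
    best = None
    for axis in range(len(points[0])):
        values = sorted(p[axis] for p in points)
        for lo, hi in zip(values, values[1:]):
            gap = hi - lo
            if gap > 0 and (best is None or gap > best[2]):
                best = (axis, lo, gap)
    return best


def select_prototypes(points, max_leaves):
    """Split axis-parallel at the widest gaps, then return one centroid per leaf.

    Greedy order (widest gap first) and centroid prototypes are assumptions
    made for illustration.
    """
    leaves = [list(points)]
    while len(leaves) < max_leaves:
        candidates = []
        for index, leaf in enumerate(leaves):
            split = maxdiff_split(leaf) if len(leaf) > 1 else None
            if split is not None:
                candidates.append((split[2], index, split))
        if not candidates:
            break  # nothing left to split
        _, index, (axis, cut, _) = max(candidates, key=lambda c: c[0])
        leaf = leaves.pop(index)
        leaves.append([p for p in leaf if p[axis] <= cut])
        leaves.append([p for p in leaf if p[axis] > cut])
    dims = len(points[0])
    return [
        tuple(sum(p[d] for p in leaf) / len(leaf) for d in range(dims))
        for leaf in leaves
    ]


if __name__ == "__main__":
    data = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11)]
    print(select_prototypes(data, max_leaves=2))
```

On the two well-separated groups in the usage example, the first cut falls in the wide gap between them, so each group gets its own prototype. On data without such gaps, every cut separates only nearby points, which is consistent with the abstract's remark that more nodes are then needed.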