Article ID: 10332467
Journal: Journal of Computational Science
Published Year: 2013
Pages: 7 Pages
File Type: PDF
Abstract
► We present a communication-efficient parallel formulation of the k-means clustering algorithm based on KD-trees.
► The algorithm does not require global communication and can dynamically select subsets of processes for group communication.
► The algorithm can reproduce the exact, deterministic solution of the equivalent sequential k-means algorithm, i.e., k-means run over the aggregated data.
► The method can further improve its communication efficiency by approximating the centralised k-means algorithm as closely as desired.
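
To illustrate why a distributed k-means can match the sequential result over the aggregated data, the sketch below performs one centroid update from per-partition sums and counts. It is only an assumption-laden illustration of partial-sum aggregation, not the KD-tree-based formulation the highlights describe, and it merges the statistics from all partitions in one place rather than within selected process groups. All function names (`local_stats`, `merge_and_update`, `kmeans_step`) are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]
Stats = Tuple[List[List[float]], List[int]]


def nearest(point: Point, centroids: Sequence[Point]) -> int:
    """Index of the centroid closest to `point` (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda j: sum((p - c) ** 2 for p, c in zip(point, centroids[j])),
    )


def local_stats(points: Sequence[Point], centroids: Sequence[Point]) -> Stats:
    """Per-cluster coordinate sums and counts for one data partition."""
    dim = len(centroids[0])
    sums = [[0.0] * dim for _ in centroids]
    counts = [0] * len(centroids)
    for pt in points:
        j = nearest(pt, centroids)
        counts[j] += 1
        for d in range(dim):
            sums[j][d] += pt[d]
    return sums, counts


def merge_and_update(stats: Sequence[Stats], centroids: Sequence[Point]) -> List[Point]:
    """Combine partition statistics into new centroids.

    A cluster that received no points keeps its previous centroid.
    """
    dim = len(centroids[0])
    updated: List[Point] = []
    for j, old in enumerate(centroids):
        n = sum(counts[j] for _, counts in stats)
        if n == 0:
            updated.append(tuple(old))
        else:
            updated.append(
                tuple(sum(sums[j][d] for sums, _ in stats) / n for d in range(dim))
            )
    return updated


def kmeans_step(partitions: Sequence[Sequence[Point]], centroids: Sequence[Point]) -> List[Point]:
    """One k-means iteration computed from per-partition statistics."""
    return merge_and_update([local_stats(p, centroids) for p in partitions], centroids)
```

Because each new centroid depends only on the total sum and total count per cluster, splitting the data across partitions does not change the update, apart from possible last-bit floating-point differences caused by summation order.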
Related Topics
Physical Sciences and Engineering > Computer Science > Computational Theory and Mathematics