کد مقاله | کد نشریه | سال انتشار | مقاله انگلیسی | نسخه تمام متن |
---|---|---|---|---|
439349 | 690530 | 2006 | 16 صفحه PDF | دانلود رایگان |

We consider the following general correlation-clustering problem [N. Bansal, A. Blum, S. Chawla, Correlation clustering, in: Proc. 43rd Annu. IEEE Symp. on Foundations of Computer Science, Vancouver, Canada, November 2002, pp. 238–250]: given a graph with real nonnegative edge weights and a 〈+〉/〈-〉 edge labelling, partition the vertices into clusters to minimize the total weight of cut 〈+〉 edges and uncut 〈-〉 edges. Thus, 〈+〉 edges with large weights (representing strong correlations between endpoints) encourage those endpoints to belong to a common cluster while 〈-〉 edges with large weights encourage the endpoints to belong to different clusters. In contrast to most clustering problems, correlation clustering specifies neither the desired number of clusters nor a distance threshold for clustering; both of these parameters are effectively chosen to be the best possible by the problem definition.Correlation clustering was introduced by Bansal et al. [Correlation clustering, in: Proc. 43rd Annu. IEEE Symp. on Foundations of Computer Science, Vancouver, Canada, November 2002, pp. 238–250], motivated by both document clustering and agnostic learning. They proved NP-hardness and gave constant-factor approximation algorithms for the special case in which the graph is complete (full information) and every edge has the same weight. We give an O(logn)-approximation algorithm for the general case based on a linear-programming rounding and the “region-growing’’ technique. We also prove that this linear program has a gap of Ω(logn), and therefore our approximation is tight under this approach. We also give an O(r3)-approximation algorithm for Kr,r-minor-free graphs. On the other hand, we show that the problem is equivalent to minimum multicut, and therefore APX-hard and difficult to approximate better than Θ(logn).
Journal: Theoretical Computer Science - Volume 361, Issues 2–3, 1 September 2006, Pages 172-187