Article ID Journal Published Year Pages File Type
530919 Pattern Recognition 2014 13 Pages PDF
Abstract

•We propose a new semi-supervised clustering algorithm given pairwise constraints.•It can expand the influence of the pairwise constraints locally and proportionally.•It utilizes the pairwise constraints in an “edge→vertex→edgeedge→vertex→edge” manner.•It introduces an intermediate structure to uncover the underlying sub-structures.•Given a k-NN sparse similarity matrix, its time complexity is approximately linear.

A key issue of semi-supervised clustering is how to utilize the limited but informative pairwise constraints. In this paper, we propose a new graph-based constrained clustering algorithm, named SCRAWL. It is composed of two random walks with different granularities. In the lower-level random walk, SCRAWL partitions the vertices (i.e., data points) into constrained and unconstrained ones, according to whether they are in the pairwise constraints. For every constrained vertex, its influence range, or the degrees of influence it exerts on the unconstrained vertices, is encapsulated in an intermediate structure called component. The edge set between each pair of components determines the affecting scope of the pairwise constraints. In the higher-level random walk, SCRAWL enforces the pairwise constraints on the components, so that the constraint influence can be propagated to the unconstrained edges. At last, we combine the cluster membership of all the components to obtain the cluster assignment for each vertex. The promising experimental results on both synthetic and real-world data sets demonstrate the effectiveness of our method.

Related Topics
Physical Sciences and Engineering Computer Science Computer Vision and Pattern Recognition
Authors
, , , ,