Article ID: 534487
Journal: Pattern Recognition Letters
Published Year: 2015
Pages: 9 Pages
File Type: PDF
Abstract

• A novel alternative clustering algorithm, SmIB, is proposed.
• The SmIB algorithm is based on the multivariate Information Bottleneck method.
• The existing reference clusterings are viewed as one type of side information.
• The multivariate Information Bottleneck guarantees the quality of the new clustering.
• The SmIB algorithm can be used to analyze both co-occurrence and non-co-occurrence data.

Traditional clustering algorithms aim to find a single clustering of the data. However, complex data are difficult to interpret accurately, and they often admit several different meaningful explanations. For such situations, this paper presents a novel alternative clustering algorithm that takes existing reference clusterings as side information and incorporates this information into the multivariate Information Bottleneck (IB) method. The side information steers the learning algorithm toward an alternative clustering that differs from the existing reference clusterings, while the multivariate IB method guarantees the quality of the new clustering. The method can incorporate multiple existing reference clusterings into the alternative cluster learning process and can analyze both co-occurrence and non-co-occurrence data. Moreover, it is able to discover non-linear alternative clusterings. Experimental results on synthetic and real-world datasets demonstrate that the proposed algorithm outperforms existing state-of-the-art alternative clustering algorithms.
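
The trade-off described in the abstract (stay informative about the data while differing from a reference clustering) can be illustrated with a small sketch. The abstract does not state the SmIB objective, so the score below is an assumption: it rewards mutual information between a candidate clustering and a discrete feature, and penalizes mutual information between the candidate and one reference clustering. The function names, the weight `beta`, the hard cluster assignments, and the toy data are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information I(X;Y), in nats, of two equal-length label sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), count in joint.items():
        # p(x,y) * log( p(x,y) / (p(x) * p(y)) ), written with raw counts
        mi += (count / n) * math.log(count * n / (px[x] * py[y]))
    return mi


def alternative_score(candidate, features, reference, beta=1.0):
    """Assumed trade-off (not the paper's objective): keep information about
    the features, shed information about the reference clustering."""
    return (mutual_information(candidate, features)
            - beta * mutual_information(candidate, reference))


# Toy data (hypothetical): 8 objects, one discrete feature, one reference clustering.
features = ["a", "a", "b", "b", "a", "a", "b", "b"]
reference = [0, 0, 0, 0, 1, 1, 1, 1]

# Candidate 1 simply repeats the reference clustering.
same_as_reference = [0, 0, 0, 0, 1, 1, 1, 1]
# Candidate 2 groups objects by the feature, independently of the reference.
alternative = [0, 0, 1, 1, 0, 0, 1, 1]

score_same = alternative_score(same_as_reference, features, reference)
score_alt = alternative_score(alternative, features, reference)
print(score_same, score_alt)
```

Under this assumed score, the candidate that follows the feature and is statistically independent of the reference clustering scores higher than the candidate that repeats the reference. The multivariate IB method itself works with probabilistic assignments and a network of variables, which this sketch does not attempt to reproduce.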

Related Topics
Physical Sciences and Engineering > Computer Science > Computer Vision and Pattern Recognition