Article ID: 6938809
Journal: Pattern Recognition
Published Year: 2018
Pages: 13
File Type: PDF
Abstract
Many real-world applications call for sharing knowledge among different tasks or datasets. Transfer learning was proposed to address this kind of problem and has been applied successfully in supervised and semi-supervised learning settings. However, its adoption in clustering, one of the most classical research problems in machine learning and data mining, is still scarce. Spectral clustering, a major clustering algorithm with wide applications and typically better performance than k-means, has not yet been well integrated with knowledge transfer. In this paper, we first consider the problem of learning from a single auxiliary unlabeled dataset for spectral clustering and propose a novel algorithm called transfer spectral clustering (TSC). We then extend it to settings with multiple auxiliary tasks. TSC assumes that the feature embeddings are shared with the auxiliary tasks and uses co-clustering to extract useful information from the auxiliary datasets to improve clustering performance. TSC exploits not only the data manifold information of each individual task but also the feature manifold information shared between related tasks. An in-depth explanation of the algorithm is provided, together with a convergence analysis. Extensive experiments demonstrate that, compared with other state-of-the-art clustering algorithms, TSC can effectively improve clustering performance by using auxiliary unlabeled data.
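For orientation, the single-task baseline that the abstract refers to can be sketched as follows. This is a minimal illustration of standard normalized spectral clustering restricted to two clusters; it is not the TSC algorithm, which the abstract does not describe in enough detail to reproduce. The Gaussian affinity, the bandwidth `sigma`, the power-iteration solver, and the function names `gaussian_affinity` and `spectral_bipartition` are all assumptions made for illustration.

```python
import math
import random


def gaussian_affinity(points, sigma=1.0):
    """Dense Gaussian (RBF) affinity matrix with a zero diagonal."""
    n = len(points)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            W[i][j] = W[j][i] = math.exp(-d2 / (2.0 * sigma ** 2))
    return W


def spectral_bipartition(points, sigma=1.0, iters=200, seed=0):
    """Split points into two clusters by the sign of the second
    eigenvector of the normalized affinity M = D^{-1/2} W D^{-1/2}."""
    n = len(points)
    W = gaussian_affinity(points, sigma)
    # Guard against a zero degree (an isolated point after underflow).
    deg = [max(sum(row), 1e-12) for row in W]
    root = [math.sqrt(d) for d in deg]
    M = [[W[i][j] / (root[i] * root[j]) for j in range(n)] for i in range(n)]

    # The leading eigenvector of M is D^{1/2} 1 (eigenvalue 1), known in
    # closed form, so it can be projected out instead of computed.
    norm = math.sqrt(sum(deg))
    v1 = [r / norm for r in root]

    # Power iteration on M + I (eigenvalues shifted into [0, 2]) with the
    # leading eigenvector removed converges to the second eigenvector.
    rng = random.Random(seed)
    x = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    for _ in range(iters):
        dot = sum(a * b for a, b in zip(x, v1))
        x = [a - dot * b for a, b in zip(x, v1)]
        y = [x[i] + sum(M[i][j] * x[j] for j in range(n)) for i in range(n)]
        size = math.sqrt(sum(a * a for a in y))
        x = [a / size for a in y]

    # Entries of the second eigenvector change sign across the two groups.
    return [1 if a > 0 else 0 for a in x]
```

Calling `spectral_bipartition` on a list of coordinate tuples returns one 0/1 label per point; which group receives which label is arbitrary. A transfer variant in the spirit of the abstract would additionally couple this embedding with an auxiliary dataset through shared feature embeddings, which this sketch does not attempt.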
Related Topics
Physical Sciences and Engineering > Computer Science > Computer Vision and Pattern Recognition