Article ID: 412217
Journal: Neurocomputing
Published Year: 2014
Pages: 15
File Type: PDF
Abstract

In spectral clustering, one defines a similarity matrix for a collection of data points, transforms the matrix to obtain the so-called Laplacian matrix, computes the eigenvectors of the Laplacian matrix, and obtains a partition of the data points using the leading eigenvectors. The last step is sometimes referred to as rounding, where one needs to decide how many leading eigenvectors to use, determine the number of clusters, and partition the data points. In this paper, we propose a novel method that uses latent tree models for rounding. The method differs from previous rounding methods in three ways. First, we relax the assumption that the number of clusters equals the number of eigenvectors used. Second, when deciding how many leading eigenvectors to use, we rely not only on information contained in the leading eigenvectors themselves, but also on the subsequent eigenvectors. Third, our method is model-based and solves all three subproblems of rounding using latent tree models. We evaluate our method on both synthetic and real-world data. The results show that our method works correctly in the ideal case where between-cluster similarity is 0, and degrades gracefully as one moves away from the ideal case.
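For context, the sketch below illustrates the standard spectral clustering pipeline that the abstract describes, up to the eigenvector step. The Gaussian-kernel similarity, the symmetric normalized Laplacian, and the parameters `sigma` and `k` are illustrative assumptions rather than the paper's specific choices, and the latent tree model rounding step proposed in the paper is not reproduced here.

import numpy as np

def leading_eigenvectors(X, sigma=1.0, k=3):
    """Return the k leading eigenvectors of the normalized Laplacian of X."""
    # Pairwise squared Euclidean distances.
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    # Similarity (affinity) matrix via a Gaussian kernel (an assumed choice).
    W = np.exp(-sq_dists / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    # Symmetric normalized Laplacian: L = I - D^{-1/2} W D^{-1/2}.
    d = W.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    L = np.eye(len(X)) - D_inv_sqrt @ W @ D_inv_sqrt
    # For the Laplacian, the "leading" eigenvectors are those with the
    # smallest eigenvalues; np.linalg.eigh returns them in ascending order.
    eigvals, eigvecs = np.linalg.eigh(L)
    return eigvecs[:, :k]

# Example: two well-separated blobs.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.1, (20, 2)), rng.normal(3, 0.1, (20, 2))])
V = leading_eigenvectors(X, sigma=0.5, k=2)
# Rounding (e.g. k-means on the rows of V, or the latent tree models
# proposed in the paper) would then partition the data points.

In this ideal two-blob setting the between-cluster similarities are close to 0, so the rows of V separate cleanly by cluster; the paper's rounding method addresses the harder questions of how many eigenvectors to keep and how many clusters to form as one moves away from this ideal case.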

Related Topics
Physical Sciences and Engineering › Computer Science › Artificial Intelligence
Authors