Article ID: 402457
Journal: Knowledge-Based Systems
Published Year: 2016
Pages: 13
File Type: PDF
Abstract

Matrix factorization techniques such as nonnegative matrix factorization (NMF) and concept factorization (CF) have attracted a great deal of attention in recent years, mainly because of their capacity for dimensionality reduction and sparse data representation. Both techniques are unsupervised and therefore make no use of a priori knowledge to guide the clustering process, which can lead to inferior performance in some scenarios. As a remedy, a semi-supervised learning method called Pairwise Constrained Concept Factorization (PCCF) was introduced to incorporate pairwise constraints into the CF framework. Despite its improved performance, PCCF uses only a priori knowledge and neglects the proximity information of the whole data distribution; this can lead to rather poor performance (although slightly better than that of CF) when only limited a priori information is available. To address this issue, we propose a novel method called Constrained Neighborhood Preserving Concept Factorization (CNPCF). CNPCF uses both a priori knowledge and the local geometric structure of the dataset to guide its clustering. Experimental studies on three real-world clustering tasks demonstrate that our method yields a better data representation and achieves much improved clustering performance, in terms of accuracy and mutual information, compared with state-of-the-art techniques.

Related Topics
Physical Sciences and Engineering > Computer Science > Artificial Intelligence