Article ID: 6903227
Journal: Applied Soft Computing
Published Year: 2018
Pages: 23
File Type: PDF
Abstract
Co-clustering refers to the simultaneous clustering of objects and their features. It is used when the data exhibit similarities only in a subset of features rather than across the whole feature set. Clustering (and co-clustering) has been proven to be an optimization problem, which makes evolutionary algorithms suitable candidates for optimizing the cluster labels. Genetic algorithms have been used in the literature for data clustering by optimizing cluster labels to reduce the mean distance from cluster centers. Using only genetic operators and Euclidean distances, however, has resulted in limited success. In this paper, we propose a Genetic Algorithm framework for co-clustering data. What distinguishes this contribution is a co-similarity objective function that combines multiple objective functions to integrate the co-clustering framework into the optimization problem. Co-similarity matrices are intertwined row and column similarity matrices, each computed on the basis of the other. To the best of our knowledge, we are the first to propose using a Genetic Algorithm to optimize co-similarity matrices for the co-clustering task. We conduct several experiments to analyse the performance of the proposed approach and compare it with numerous state-of-the-art clustering and co-clustering algorithms on a variety of real-world datasets. Our results show that the proposed approach significantly outperforms the other clustering and co-clustering algorithms on all the datasets tested.
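To make the idea of row and column similarity matrices "computed on the basis of each other" concrete, the following is a minimal sketch of one common mutual-reinforcement scheme. It is an illustration under stated assumptions, not the paper's objective function: the function name `co_similarity`, the iteration count, and the normalization by products of row and column sums are all hypothetical choices, and the data are assumed non-negative with no all-zero rows or columns.

```python
import numpy as np


def co_similarity(data, n_iter=4):
    """Alternately update row and column similarity matrices from each other.

    Hypothetical sketch: assumes non-negative data with no all-zero
    rows or columns, so the normalizers below are strictly positive.
    """
    data = np.asarray(data, dtype=float)
    n_rows, n_cols = data.shape

    # Start from identity: every object (and feature) is similar only to itself.
    row_sim = np.eye(n_rows)
    col_sim = np.eye(n_cols)

    # Assumed normalizers: product of the total weights of each pair.
    row_totals = data.sum(axis=1)
    col_totals = data.sum(axis=0)
    row_norm = np.outer(row_totals, row_totals)
    col_norm = np.outer(col_totals, col_totals)

    for _ in range(n_iter):
        # Rows are similar when they share similar columns, and vice versa.
        new_row = (data @ col_sim @ data.T) / row_norm
        new_col = (data.T @ row_sim @ data) / col_norm
        np.fill_diagonal(new_row, 1.0)
        np.fill_diagonal(new_col, 1.0)
        row_sim, col_sim = new_row, new_col

    return row_sim, col_sim


# Toy data with two obvious co-clusters (rows 0-1 with columns 0-1,
# rows 2-3 with columns 2-3).
toy = [[1, 1, 0, 0],
       [1, 1, 0, 0],
       [0, 0, 1, 1],
       [0, 0, 1, 1]]
rows, cols = co_similarity(toy)
print(np.round(rows, 3))
```

In a Genetic Algorithm setting, matrices like these could feed a fitness function that rewards label assignments grouping highly similar rows (and columns) together; how the paper actually defines that fitness is not stated in the abstract.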
Related Topics
Physical Sciences and Engineering > Computer Science > Computer Science Applications