Kernel-driven similarity learning

Article ID	Journal	Published Year	Pages	File Type
4947002	Neurocomputing	2017	10 Pages	PDF

Abstract

Similarity measure is fundamental to many machine learning and data mining algorithms. Predefined similarity metrics are often data-dependent and sensitive to noise. Recently, data-driven approach which learns similarity information from data has drawn significant attention. The idea is to represent a data point by a linear combination of all (other) data points. However, it is often the case that more complex relationships beyond linear dependencies exist in the data. Based on the well known fact that kernel trick can capture the nonlinear structure information, we extend this idea to kernel spaces. Nevertheless, such an extension brings up another issue: its algorithm performance is largely determined by the choice of kernel, which is often unknown in advance. Therefore, we further propose a multiple kernel-based learning method. By doing so, our model can learn both linear and nonlinear similarity information, and automatically choose the most suitable kernel. As a result, our model is capable of learning complete similarity information hidden in data set. Comprehensive experimental evaluations of our algorithms on clustering and recommender systems demonstrate its superior performance compared to other state-of-the-art methods. This performance also shows the great potential of our proposed algorithm for other possible applications.

Keywords

Nonlinear relation Similarity measure Clustering kernel method Recommender systems Sparse representation Multiple kernel learning