Article ID: 6939666
Journal: Pattern Recognition
Published Year: 2018
Pages: 30 Pages
File Type: PDF
Abstract
With the fast development of information acquisition, multi-modality data (e.g., text, audio, image and video) is growing rapidly in health care, multimedia retrieval and many other applications. For clustering, classification or regression with multi-modality information, it is essential to effectively measure the distance or similarity between objects described by heterogeneous features. Metric learning, which aims to find a task-oriented distance function, is an active topic in machine learning. However, most existing algorithms are inefficient on high-dimensional multi-modality tasks. In this work, we develop an effective and efficient metric learning algorithm for multi-modality data, namely Efficient Multi-modal Geometric Mean Metric Learning (EMGMML). The proposed algorithm learns a distinct distance metric for each view by minimizing the distance between similar pairs while maximizing the distance between dissimilar pairs. To avoid overfitting, the optimization objective is regularized by the symmetrized LogDet divergence. EMGMML is very efficient because each distance metric has a closed-form solution. Experimental results show that the proposed algorithm outperforms state-of-the-art metric learning methods in terms of both accuracy and efficiency.
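To make the idea of a closed-form metric concrete, the following is a minimal sketch of the single-view geometric mean metric learning solution that the method's name refers to, applied independently to each view. It is not the EMGMML algorithm itself: the abstract does not give its objective or how the views are coupled. The function names, the identity prior behind the `lam` ridge term, and the per-view loop are all assumptions made for illustration.

```python
import numpy as np


def spd_power(M, p):
    """Raise a symmetric positive-definite matrix to a real power p."""
    M = (M + M.T) / 2.0                      # remove floating-point asymmetry
    w, V = np.linalg.eigh(M)
    w = np.clip(w, 1e-12, None)              # guard against tiny negative eigenvalues
    return (V * w**p) @ V.T


def geometric_mean(A, B, t=0.5):
    """Point at position t on the geodesic between SPD matrices A and B."""
    A_half = spd_power(A, 0.5)
    A_neg_half = spd_power(A, -0.5)
    inner = A_neg_half @ B @ A_neg_half
    return A_half @ spd_power(inner, t) @ A_half


def scatter(X, pairs):
    """Sum of outer products of feature differences over the given index pairs."""
    diffs = X[pairs[:, 0]] - X[pairs[:, 1]]
    return diffs.T @ diffs


def gmml_metric(X, similar, dissimilar, lam=0.1, t=0.5):
    """Closed-form metric for one view (assumption: identity prior, weight lam)."""
    d = X.shape[1]
    S = scatter(X, similar) + lam * np.eye(d)      # similar-pair scatter
    D = scatter(X, dissimilar) + lam * np.eye(d)   # dissimilar-pair scatter
    return geometric_mean(np.linalg.inv(S), D, t)


def per_view_metrics(views, similar, dissimilar, lam=0.1, t=0.5):
    """Hypothetical helper: learn one metric per view, each independently."""
    return [gmml_metric(X, similar, dissimilar, lam, t) for X in views]
```

With `t = 0.5` the returned matrix `M` satisfies `M @ S @ M = D` for the regularized scatter matrices, which is why no iterative optimization is needed. The squared distance between two samples in a view is then `(x - y) @ M @ (x - y)`.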
Related Topics
Physical Sciences and Engineering > Computer Science > Computer Vision and Pattern Recognition