Article ID | Journal | Published Year | Pages |
---|---|---|---|
6939666 | Pattern Recognition | 2018 | 30 |
Abstract
With the rapid development of information acquisition, multi-modality data (e.g., text, audio, image and video) are growing quickly in health care, multimedia retrieval and many other applications. For clustering, classification or regression with multi-modality information, it is essential to effectively measure the distance or similarity between objects described by heterogeneous features. Metric learning, which aims to find a task-oriented distance function, is an active topic in machine learning. However, most existing algorithms are inefficient on high-dimensional multi-modality tasks. In this work, we develop an effective and efficient metric learning algorithm for multi-modality data, Efficient Multi-modal Geometric Mean Metric Learning (EMGMML). The proposed algorithm learns a distinctive distance metric for each view by minimizing the distance between similar pairs while maximizing the distance between dissimilar pairs. To avoid overfitting, the optimization objective is regularized by the symmetrized LogDet divergence. EMGMML is very efficient because each distance metric has a closed-form solution. Experimental results show that the proposed algorithm outperforms state-of-the-art metric learning methods in both accuracy and efficiency.
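The abstract does not state the formulas, so the following is only a rough sketch of the single-view building block, under the assumption that it follows the standard geometric mean metric learning formulation: with S and D the scatter matrices of similar and dissimilar pair differences, the metric is the matrix geometric mean of the inverse of the regularized S and the regularized D. The function names, the identity prior and the parameter `lam` are hypothetical choices for illustration; the per-view weighting that EMGMML uses across modalities is not reproduced here.

```python
import numpy as np

def spd_power(M, p):
    """Raise a symmetric positive-definite matrix to the power p."""
    M = (M + M.T) / 2.0                      # remove numerical asymmetry
    w, V = np.linalg.eigh(M)                 # M = V diag(w) V^T
    return (V * w**p) @ V.T

def geometric_mean(A, B):
    """Matrix geometric mean A # B of two SPD matrices."""
    A_half = spd_power(A, 0.5)
    A_neg_half = spd_power(A, -0.5)
    return A_half @ spd_power(A_neg_half @ B @ A_neg_half, 0.5) @ A_half

def gmml_single_view(X, similar_pairs, dissimilar_pairs, lam=0.1):
    """Closed-form metric for one view (assumed formulation, identity prior)."""
    d = X.shape[1]
    S = np.zeros((d, d))
    D = np.zeros((d, d))
    for i, j in similar_pairs:               # scatter of similar pairs
        diff = X[i] - X[j]
        S += np.outer(diff, diff)
    for i, j in dissimilar_pairs:            # scatter of dissimilar pairs
        diff = X[i] - X[j]
        D += np.outer(diff, diff)
    # Symmetrized LogDet regularization toward the identity prior (assumption)
    # adds lam * I to both scatter matrices.
    S_reg = S + lam * np.eye(d)
    D_reg = D + lam * np.eye(d)
    return geometric_mean(np.linalg.inv(S_reg), D_reg)

def mahalanobis_sq(A, x, y):
    """Squared distance (x - y)^T A (x - y) under the learned metric A."""
    diff = x - y
    return float(diff @ A @ diff)
```

In a multi-modality setting, one such matrix would be computed per view from that view's features. Because each metric needs only a few eigendecompositions rather than an iterative solver, this kind of closed form is what makes the approach cheap.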
Related Topics
Physical Sciences and Engineering
Computer Science
Computer Vision and Pattern Recognition
Authors
Jianqing Liang, Qinghua Hu, Pengfei Zhu, Wenwu Wang