Image annotation based on feature fusion and semantic similarity

Article ID	Journal	Published Year	Pages	File Type
407685	Neurocomputing	2015	14 Pages	PDF

Abstract

This study presents a new image annotation algorithm based on multi-feature fusion and semantic similarity. To study the relationship between feature distance and semantic similarity, the feature-annotation space is transformed into a distance–similarity space, in which the intrinsic statistical relationship between visual and semantic information can be examined. Fusing multiple features requires re-scaling. Because image features may be multimodal, traditional feature re-scaling is based on boundary values, which are sensitive to outliers. Our proposed distance re-scaling method overcomes these drawbacks by using statistical information. In the distance space, each distance vector of an image pair is treated as a sample. The anisotropic Gaussian distribution of these samples is transformed into an isotropic Gaussian with zero mean and unit variance, and the nearest-distance images are retrieved from this space. To select features with both low distance correlation and high semantic correlation, the relationship between visual and semantic information is studied using canonical correlation analysis. The canonical correlation coefficient between similarity and distance is found to be closely related to the annotation score of the feature. Experiments showed that the proposed multi-feature fusion method removes the effects of scale and of correlations among feature distances, so it represents the total distance better and finds the nearest neighbors. Tests on the Corel5K, IAPR-TC12, ESP Game, and VOC PASCAL datasets showed that the method outperformed existing approaches.
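The distance re-scaling step described in the abstract can be illustrated with a standard whitening transform. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function names `whiten_distances` and `nearest_pairs` are hypothetical, ZCA-style whitening is one of several transforms that yield zero mean and identity covariance, and summing the whitened components into a fused score is an assumption made only for illustration.

```python
import numpy as np


def whiten_distances(D, eps=1e-12):
    """Map distance vectors to zero mean and identity covariance.

    D has shape (n_pairs, n_features): one row per image pair,
    one column per feature-specific distance.
    """
    D = np.asarray(D, dtype=float)
    mu = D.mean(axis=0)
    centered = D - mu
    cov = np.atleast_2d(np.cov(centered, rowvar=False))
    # Symmetric eigendecomposition of the covariance matrix.
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Clamp tiny eigenvalues to avoid division by zero (illustrative choice).
    scale = 1.0 / np.sqrt(np.maximum(eigvals, eps))
    # ZCA-style whitening matrix (an assumption; the paper may use another form).
    W = eigvecs @ np.diag(scale) @ eigvecs.T
    return centered @ W, mu, W


def nearest_pairs(Z, k):
    """Indices of the k pairs with the smallest fused score.

    Assumption: the fused score is the plain sum of whitened components.
    """
    fused = Z.sum(axis=1)
    return np.argsort(fused)[:k]
```

After whitening, no single feature distance dominates the fused score because of its scale, and correlated distances are not counted twice, which is the effect the abstract attributes to the proposed fusion.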

Keywords

Linear transform; Canonical correlation analysis; Semantic similarity; Nearest neighbor; Feature fusion