Article ID: 405897
Journal: Neurocomputing
Published Year: 2016
Pages: 9 Pages
File Type: PDF
Abstract

The scale-invariant feature transform (SIFT) plays an important role in multimedia content analysis tasks such as near-duplicate image and video retrieval. However, the storage and query costs of SIFT features become prohibitive for large-scale databases. In this paper, SIFT features are tracked and robustly encoded with temporal information to generate temporal-concentration SIFT (TCSIFT), which greatly reduces the number of local features, and hence the visual redundancy, while retaining as many of the advantages of SIFT as possible. On the basis of TCSIFT, a novel framework for large-scale video copy retrieval is proposed in which retrieval and validation are carried out at the feature and frame levels. Experimental results on two datasets, CC_WEB_VIDEO and TRECVID, demonstrate that the method achieves comparable accuracy with compact storage and more efficient execution, and that it adapts to various video transformations.
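
The abstract does not give the details of how TCSIFT is built, so the following is only a toy sketch of the general idea of concentrating tracked features, not the authors' algorithm. It assumes that descriptors are plain lists of floats, that tracking is a greedy nearest-neighbour match between consecutive frames, and that each track is summarised by its mean descriptor plus its frame span. The function name `concentrate_tracks` and the threshold `max_dist` are hypothetical.

```python
from math import dist


def concentrate_tracks(frames, max_dist=0.5):
    """Greedily track descriptors frame to frame, then keep one mean per track.

    frames: list of frames, each a list of descriptor vectors (lists of floats).
    Returns a list of (start_frame, end_frame, mean_descriptor) tuples.
    This is an illustrative assumption, not the method from the paper.
    """
    finished = []  # tracks that could not be continued
    active = []    # tracks that were extended (or started) in the previous frame

    for t, descriptors in enumerate(frames):
        unmatched = list(range(len(descriptors)))
        next_active = []
        for track in active:
            last = track["members"][-1]
            # Nearest still-unmatched descriptor in the current frame, if any.
            best = min(unmatched, key=lambda i: dist(last, descriptors[i]),
                       default=None)
            if best is not None and dist(last, descriptors[best]) <= max_dist:
                track["members"].append(descriptors[best])
                track["end"] = t
                unmatched.remove(best)
                next_active.append(track)
            else:
                finished.append(track)
        # Every descriptor left unmatched starts a new track.
        for i in unmatched:
            next_active.append({"start": t, "end": t,
                                "members": [descriptors[i]]})
        active = next_active
    finished.extend(active)

    result = []
    for track in finished:
        n = len(track["members"])
        mean = [sum(column) / n for column in zip(*track["members"])]
        result.append((track["start"], track["end"], mean))
    return result
```

With this sketch, a feature that persists over many frames is stored once together with its temporal extent, rather than once per frame, which is the kind of redundancy reduction the abstract describes.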

Related Topics
Physical Sciences and Engineering > Computer Science > Artificial Intelligence