Article ID: 528751
Journal: Journal of Visual Communication and Image Representation
Published Year: 2013
Pages: 8
File Type: PDF
Abstract

• We propose a q-Gaussian mixture model (q-GMM) for image and video semantic indexing.
• The q-GMM has a parameter q that controls its tail-heaviness.
• The q-GMM is more suitable than a GMM for representing images and videos.
• Our proposed method outperformed bag-of-visual-words on PASCAL VOC and TRECVID datasets.

Gaussian mixture models (GMMs), which extend Bag-of-Visual-Words (BoW) to a probabilistic framework, have proven effective for image and video semantic indexing. Recently, the q-Gaussian distribution, derived from Tsallis statistics [11], has been shown to be useful for representing patterns in many complex systems in physics. We propose q-Gaussian mixture models (q-GMMs), mixture models of q-Gaussian distributions in which a parameter q controls tail-heaviness, for image and video semantic indexing [1]. The long-tailed distributions obtained for q > 1 are expected to represent complexly correlated data effectively and hence to improve robustness against outliers. The main improvements over our previous study [1] are a q-GMM super-vector representation for efficiently computing the q-GMM kernel, and a detailed experimental analysis comparing accuracy and testing cost with recent kernel methods. Our proposed method outperformed BoW, achieving Mean Average Precision of 49.42% on PASCAL VOC 2010 and 10.90% on TRECVID 2010 Semantic Indexing.
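
To illustrate how q controls tail-heaviness, the following is a minimal sketch of an unnormalized one-dimensional q-Gaussian, using the standard Tsallis form [1 - (1 - q) * beta * x^2]^(1 / (1 - q)), which reduces to exp(-beta * x^2) as q approaches 1. The function name and the scale parameter beta are assumptions made for illustration; the multivariate parameterization and normalization used in the paper are not reproduced here.

```python
import math


def q_gaussian_unnormalized(x, q, beta=1.0):
    """Unnormalized 1-D q-Gaussian (illustrative sketch, not the paper's exact form).

    q == 1 gives the ordinary Gaussian kernel exp(-beta * x^2).
    q > 1 gives a power-law (heavier) tail.
    q < 1 gives compact support (the value is 0 outside it).
    """
    if abs(q - 1.0) < 1e-12:
        return math.exp(-beta * x * x)
    base = 1.0 - (1.0 - q) * beta * x * x
    if base <= 0.0:
        # Only reachable for q < 1: the point lies outside the support.
        return 0.0
    return base ** (1.0 / (1.0 - q))


if __name__ == "__main__":
    # Compare the tail value far from the mode for several q.
    for q in (1.0, 1.2, 1.5):
        print(q, q_gaussian_unnormalized(3.0, q))
```

For a fixed x away from the mode, the value grows with q in this sketch, which is the heavier tail that the abstract associates with robustness against outliers.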
