Learning semantic concepts from image database with hybrid generative/discriminative approach

Article ID	Journal	Published Year	Pages	File Type
380755	Engineering Applications of Artificial Intelligence	2013	10 Pages	PDF

Abstract

• We present continuous PLSA and derive corresponding EM algorithm.
• We propose a hybrid generative/discriminative approach to learn visual concepts.
• Our approach can integrate correlation between labels when it classifies images.
• High accuracy and superior effectiveness of our approach are reported.

The semantic gap has become a bottleneck for content-based image retrieval in recent years. To bridge this gap and improve retrieval performance, automatic image annotation has emerged as a crucial problem. In this paper, a hybrid approach is proposed to learn the semantic concepts of images automatically. First, we present continuous probabilistic latent semantic analysis (PLSA) and derive its corresponding Expectation–Maximization (EM) algorithm. Continuous PLSA assumes that elements are sampled from a multivariate Gaussian distribution given a latent aspect, rather than from a multinomial distribution as in traditional PLSA. Second, we propose a hybrid framework that employs continuous PLSA to model the visual features of images in the generative learning stage and uses ensembles of classifier chains to classify the multi-label data in the discriminative learning stage. The framework can therefore learn the correlations between features as well as the correlations between words. Because the hybrid approach combines the advantages of generative and discriminative learning, it can accurately predict semantic annotations for unseen images. Finally, we conduct experiments on three baseline datasets, and the results show that our approach outperforms many state-of-the-art approaches.
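
To make the continuous PLSA idea in the abstract more concrete, the following is a minimal sketch of one plausible EM loop, not the authors' derivation. It assumes that each image is a bag of region feature vectors, that each latent aspect has its own multivariate Gaussian, and that each image carries a mixing distribution P(z|d) over aspects. The function name `continuous_plsa_em`, its parameters, the initialisation, and the small covariance regulariser `reg` are all hypothetical choices made for illustration.

```python
import numpy as np


def log_gaussian(x, mean, cov):
    """Log density of N(mean, cov) evaluated at each row of x."""
    d = x.shape[1]
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    # Mahalanobis distance of every row, via a linear solve instead of an inverse.
    maha = np.sum(diff * np.linalg.solve(cov, diff.T).T, axis=1)
    return -0.5 * (d * np.log(2.0 * np.pi) + logdet + maha)


def continuous_plsa_em(x, doc_ids, n_docs, n_aspects, n_iter=50, reg=1e-6, seed=0):
    """Fit P(z|d), aspect means and aspect covariances by EM (illustrative only).

    x       : (N, D) array, one feature vector per image region (assumed input)
    doc_ids : (N,) integer array, index of the image each row belongs to;
              every image is assumed to own at least one row
    """
    rng = np.random.default_rng(seed)
    n, d = x.shape
    # Initialisation (an arbitrary choice): random data points as means,
    # the global covariance for every aspect, uniform P(z|d).
    means = x[rng.choice(n, size=n_aspects, replace=False)].copy()
    covs = np.tile(np.cov(x, rowvar=False) + reg * np.eye(d), (n_aspects, 1, 1))
    p_z_d = np.full((n_docs, n_aspects), 1.0 / n_aspects)
    log_liks = []

    for _ in range(n_iter):
        # E-step: P(z|d,x) is proportional to P(z|d) * N(x | mu_z, Sigma_z).
        log_px_z = np.column_stack(
            [log_gaussian(x, means[k], covs[k]) for k in range(n_aspects)]
        )
        log_joint = np.log(p_z_d[doc_ids] + 1e-300) + log_px_z
        log_norm = np.logaddexp.reduce(log_joint, axis=1)
        resp = np.exp(log_joint - log_norm[:, None])
        log_liks.append(log_norm.sum())

        # M-step: responsibility-weighted Gaussian updates per aspect.
        weight = resp.sum(axis=0) + 1e-12
        means = (resp.T @ x) / weight[:, None]
        for k in range(n_aspects):
            diff = x - means[k]
            covs[k] = (resp[:, k, None] * diff).T @ diff / weight[k] + reg * np.eye(d)
        # M-step: P(z|d) is the mean responsibility over the regions of image d.
        p_z_d = np.zeros((n_docs, n_aspects))
        np.add.at(p_z_d, doc_ids, resp)
        p_z_d /= p_z_d.sum(axis=1, keepdims=True)

    return p_z_d, means, covs, log_liks
```

In a hybrid pipeline of the kind the abstract describes, the per-image aspect distribution returned as `p_z_d` could then serve as the input representation for a multi-label classifier such as an ensemble of classifier chains. That second stage is not shown here, and the regulariser added to each covariance is a numerical safeguard rather than part of a textbook EM update.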

Keywords

Image retrieval; Automatic image annotation; Semantic gap; Hybrid framework