کد مقاله کد نشریه سال انتشار مقاله انگلیسی نسخه تمام متن
4969358 1449932 2017 11 صفحه PDF دانلود رایگان
عنوان انگلیسی مقاله ISI
Automatic image annotation based on Gaussian mixture model considering cross-modal correlations
ترجمه فارسی عنوان
حاشیه نویسی تصویر اتوماتیک بر اساس مدل ترکیبی گاوسی با توجه به همبستگی های متقابل
موضوعات مرتبط
مهندسی و علوم پایه مهندسی کامپیوتر چشم انداز کامپیوتر و تشخیص الگو
چکیده انگلیسی
Automatic image annotation has been an active topic of research in the field of computer vision and pattern recognition for decades. In this paper, we present a new method for automatic image annotation based on Gaussian mixture model (GMM) considering cross-modal correlations. To be specific, we first employ GMM fitted by the rival penalized expectation-maximization (RPEM) algorithm to estimate the posterior probabilities of each annotation keyword. Next, a label similarity graph is constructed by a weighted linear combination of label similarity and visual similarity by seamlessly integrating the information from both image low level visual features and high level semantic concepts together, which can effectively avoid the phenomenon that different images with the same candidate annotations would obtain the same refinement results. Followed by the rank-two relaxation heuristics over the built label similarity graph is applied to further mine the correlation of the candidate annotations so as to capture the refining annotation results, which plays a crucial role in the semantic based image retrieval. The main contributions of this work can be summarized as follows: (1) Exploiting GMM that is trained by the RPEM algorithm to capture the initial semantic annotations of images. (2) The label similarity graph is constructed by a weighted linear combination of label similarity and visual similarity of images associated with the corresponding labels. (3) Refining the candidate set of annotations generated by the GMM through solving the max-bisection based on the rank-two relaxation algorithm over the weighted label graph. Compared to the current competitive model SGMM-RW, we can achieve significant improvements of 4% and 5% in precision, 6% and 9% in recall on the Corel5k and Mirflickr25k, respectively.
ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Journal of Visual Communication and Image Representation - Volume 44, April 2017, Pages 50-60
نویسندگان
, ,