کد مقاله | کد نشریه | سال انتشار | مقاله انگلیسی | نسخه تمام متن |
---|---|---|---|---|
6863546 | 1439515 | 2018 | 11 صفحه PDF | دانلود رایگان |
عنوان انگلیسی مقاله ISI
Deep sequential fusion LSTM network for image description
دانلود مقاله + سفارش ترجمه
دانلود مقاله ISI انگلیسی
رایگان برای ایرانیان
کلمات کلیدی
موضوعات مرتبط
مهندسی و علوم پایه
مهندسی کامپیوتر
هوش مصنوعی
پیش نمایش صفحه اول مقاله

چکیده انگلیسی
It is a challenging task to perform automatic image description, which aims to translate an image with visual information into natural language conforming to certain proper grammars and sentence structures. In this work, an optimal learning framework called deep sequential fusion based long short term memory network is designed. In the proposed framework, a layer-wise strategy is introduced into the generation process of recurrent neural network to increase the depth of language model for producing more abstract and discriminative features. Then, a deep supervision method is developed to enrich the model capacity with extra regularization. Moreover, the prediction scores from all of the auxiliary branches in the language model are employed to fuse the final decision output with product rule, which further makes use of the optimized model parameters and hence boosts the performance. The experimental results on two public benchmark datasets verify the effectiveness of the proposed approaches, with the consensus-based image description evaluation metric (CIDEr) being 103.4 on the MSCOCO dataset and the metric for evaluation of translation with explicit ordering (METEOR) reaching to 20.6 on the Flickr30K dataset.
ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Neurocomputing - Volume 312, 27 October 2018, Pages 154-164
Journal: Neurocomputing - Volume 312, 27 October 2018, Pages 154-164
نویسندگان
Pengjie Tang, Hanli Wang, Sam Kwong,