Image captioning with triple-attention and stack parallel LSTM

Article code	Journal code	Publication year	English article	Full-text version
11012485	1798846	2018	16-page PDF	Free download

Keywords

LSTM, Attention, CNN, Deep learning

Related topics

Engineering and Basic Sciences, Computer Engineering, Artificial Intelligence

Abstract

Image captioning aims to describe the content of an image with a sentence. It is a natural way for people to express their understanding, but a challenging and important task from the viewpoint of image understanding. In this paper, we propose two innovations to improve performance on this sequence learning problem. First, we introduce a new attention method named triple attention (TA-LSTM), which can leverage image context information at every stage of the LSTM. Second, we redesign the structure of the basic LSTM by adopting both stacked and parallel LSTMs, a structure we call PS-LSTM, to improve performance over the normal LSTM. With this structure, the proposed model can ensemble more parameters in a single model and has ensemble ability of its own. In numerical experiments on the publicly available MSCOCO dataset, our final TA-PS-LSTM model achieves performance comparable to some state-of-the-art methods.
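The abstract does not spell out how the attention is computed, so the following is only a rough illustration of generic soft attention over image regions, not the authors' TA-LSTM. The function name `soft_attention`, the dot-product scoring, and the toy numbers are all assumptions made for this sketch. How such a context vector would be fed into each stage of the LSTM is specific to the paper and is not shown.

```python
import math

def soft_attention(region_features, hidden_state):
    """Return (weights, context) for generic dot-product soft attention.

    region_features: list of feature vectors, one per image region.
    hidden_state: current decoder state, same length as each feature vector.
    Dot-product scoring is an assumption of this sketch, not the paper's method.
    """
    # Score each region by its similarity to the decoder state.
    scores = [sum(f * h for f, h in zip(feat, hidden_state))
              for feat in region_features]
    # Softmax with max-subtraction for numerical stability.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Context vector: attention-weighted sum of the region features.
    dim = len(region_features[0])
    context = [sum(w * feat[d] for w, feat in zip(weights, region_features))
               for d in range(dim)]
    return weights, context

# Toy example: three image regions with 2-D features (made-up numbers).
regions = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
state = [0.5, 2.0]
weights, context = soft_attention(regions, state)
```

The weights sum to one, and the region most similar to the decoder state contributes most to the context vector. A captioning decoder would typically recompute this context at every word it generates.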

Publisher

Database: Elsevier - ScienceDirect
Journal: Neurocomputing - Volume 319, 30 November 2018, Pages 55-65

Authors

Xinxin Zhu, Lixiang Li, Jing Liu, Ziyi Li, Haipeng Peng, Xinxin Niu
