Article code: 11030056
Journal code: 1646393
Publication year: 2018
English article: 10-page PDF
Full-text version: Free download
English title of the ISI article
Learning deep spatiotemporal features for video captioning
Related topics
Engineering and Basic Sciences > Computer Engineering > Computer Vision and Pattern Recognition
English abstract
In this paper, we propose a novel automatic video captioning system that translates videos into sentences, using a deep neural network composed of three building blocks with convolutional and recurrent structure. The first subnetwork operates as a feature extractor for single frames. The second subnetwork is a three-stream network that captures spatial semantic information in the first stream, temporal semantic information in the second stream, and global video concept information in the third stream. The third subnetwork generates relevant textual captions, taking the spatiotemporal features of the second subnetwork as input. The experimental validation indicates the effectiveness of the proposed model, which achieves superior performance over competitive methods.
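The data flow described in the abstract can be pictured with a minimal, standard-library Python sketch. It is not the authors' implementation: every function name, the pooling operations standing in for each stream, the untrained random decoder weights, and the toy vocabulary are assumptions made purely for illustration. Only the overall shape (per-frame features, then three streams, then a recurrent caption generator) comes from the abstract.

```python
import math
import random
from typing import List

Vector = List[float]
VOCAB = ["<eos>", "a", "person", "is", "cooking"]  # hypothetical toy vocabulary


def frame_features(frame: Vector, dim: int = 4) -> Vector:
    """Subnetwork 1 stand-in: average-pool a flat frame into `dim` values.

    Assumes len(frame) >= dim; a real system would use a CNN here.
    """
    chunk = len(frame) // dim
    return [sum(frame[i * chunk:(i + 1) * chunk]) / chunk for i in range(dim)]


def three_stream(feats: List[Vector]) -> Vector:
    """Subnetwork 2 stand-in: concatenate three summaries of the frame features."""
    steps, dim = len(feats), len(feats[0])
    # Stream 1 (spatial, assumed): mean of each feature over time.
    spatial = [sum(f[d] for f in feats) / steps for d in range(dim)]
    # Stream 2 (temporal, assumed): mean absolute change between frames.
    temporal = [
        sum(abs(feats[t + 1][d] - feats[t][d]) for t in range(steps - 1))
        / max(steps - 1, 1)
        for d in range(dim)
    ]
    # Stream 3 (global concept, assumed): maximum of each feature over time.
    global_ = [max(f[d] for f in feats) for d in range(dim)]
    return spatial + temporal + global_


def decode_caption(video_vec: Vector, vocab: List[str],
                   max_len: int = 5, seed: int = 0) -> List[str]:
    """Subnetwork 3 stand-in: greedy decoding with an untrained recurrent update.

    Simplification: the previously chosen word is not fed back into the state.
    """
    rng = random.Random(seed)
    hidden = len(video_vec)
    w_rec = [[rng.uniform(-1, 1) for _ in range(hidden)] for _ in range(hidden)]
    w_out = [[rng.uniform(-1, 1) for _ in range(hidden)] for _ in vocab]
    state = [math.tanh(v) for v in video_vec]
    caption: List[str] = []
    for _ in range(max_len):
        # Recurrent state update, then pick the highest-scoring word.
        state = [math.tanh(sum(w * s for w, s in zip(row, state))) for row in w_rec]
        scores = [sum(w * s for w, s in zip(row, state)) for row in w_out]
        word = vocab[scores.index(max(scores))]
        if word == "<eos>":
            break
        caption.append(word)
    return caption


# Usage: 8 random "frames" of 16 values each stand in for a video.
rng = random.Random(1)
video = [[rng.random() for _ in range(16)] for _ in range(8)]
video_vec = three_stream([frame_features(f) for f in video])
caption = decode_caption(video_vec, VOCAB)
```

Because the weights are random and untrained, the resulting word sequence is meaningless; the sketch only shows how the three stages connect and how the feature dimensions combine (three streams of 4 values give a 12-value video descriptor).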
Publisher
Database: Elsevier - ScienceDirect
Journal: Pattern Recognition Letters - Volume 116, 1 December 2018, Pages 143-149