Article code | Journal code | Publication year | English article | Full text
4970455 | 1450124 | 2017 | 12 pages, PDF | Free download
English title of the ISI article
Action Recognition from Still Images Based on Deep VLAD Spatial Pyramids
Related topics
Engineering and Basic Sciences, Computer Engineering, Computer Vision and Pattern Recognition
English abstract
The recognition of human actions in images is a challenging task in computer vision. In many applications, actions can be exploited as mid-level semantic features for high level tasks. Actions often appear in fine-grained categorization, where the differences between two categories are small. Recently, deep learning approaches have achieved great success in many vision tasks, e.g., image classification, object detection, and attribute and action recognition. Also, the Bag-of-Visual-Words (BoVW) and its extensions, e.g., Vector of Locally Aggregated Descriptors (VLAD) encoding, have proved to be powerful in identifying global contextual information. In this paper, we propose a new action recognition scheme by combining the powerful feature representational capabilities of Convolutional Neural Networks (CNNs) with the VLAD encoding scheme. Specifically, we encode the CNN features of image patches generated by a region proposal algorithm with VLAD and subsequently represent an image by the compact code, which not only captures the more fine-grained properties of the images but also contains global contextual information. To identify the spatial information, we exploit the spatial pyramid representation and encode CNN features inside each pyramid. Experiments have verified that the proposed schemes are not only suitable for action recognition but also applicable to more general recognition tasks such as attribute classification. The proposed scheme is validated with four benchmark datasets with competitive mAP results of 88.5% on the Stanford 40 Action dataset, 81.3% on the People Playing Musical Instrument dataset, 90.4% on the Berkeley Attributes of People dataset and 74.2% on the 27 Human Attributes dataset.
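The encoding step described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the listing does not say how the codebook is learned, how the pyramid is laid out, or which normalization is used, so the sketch assumes a standard VLAD formulation (nearest-center assignment, summed residuals, signed square-root and L2 normalization) and a simple grid pyramid over normalized patch positions. The function names `vlad_encode` and `pyramid_vlad`, the `levels` parameter, and the random arrays standing in for CNN features and a k-means codebook are all hypothetical.

```python
import numpy as np


def vlad_encode(descriptors, centers):
    """Aggregate local descriptors (n, d) against k centers (k, d) into a VLAD vector (k*d,)."""
    k, d = centers.shape
    # Squared Euclidean distance from every descriptor to every center: shape (n, k).
    dists = ((descriptors[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    nearest = dists.argmin(axis=1)
    vlad = np.zeros((k, d))
    for j in range(k):
        members = descriptors[nearest == j]
        if len(members):
            # Sum of residuals between the assigned descriptors and their center.
            vlad[j] = (members - centers[j]).sum(axis=0)
    vlad = vlad.ravel()
    # Signed square-root normalization followed by L2 normalization (assumed, common for VLAD).
    vlad = np.sign(vlad) * np.sqrt(np.abs(vlad))
    norm = np.linalg.norm(vlad)
    return vlad / norm if norm > 0 else vlad


def pyramid_vlad(descriptors, positions, centers, levels=(1, 2)):
    """Concatenate one VLAD vector per spatial-pyramid cell.

    positions holds (x, y) patch centers normalized to [0, 1], shape (n, 2).
    """
    codes = []
    for g in levels:
        # Map each normalized position to a cell index on a g x g grid.
        cells = np.minimum((positions * g).astype(int), g - 1)
        for row in range(g):
            for col in range(g):
                mask = (cells[:, 0] == col) & (cells[:, 1] == row)
                # Empty cells yield an all-zero VLAD block.
                codes.append(vlad_encode(descriptors[mask], centers))
    return np.concatenate(codes)


rng = np.random.default_rng(0)
descriptors = rng.normal(size=(50, 8))  # stand-in for CNN features of 50 region proposals
positions = rng.random(size=(50, 2))    # stand-in for normalized patch centers
centers = rng.normal(size=(4, 8))       # stand-in for a k-means codebook
code = pyramid_vlad(descriptors, positions, centers)
print(code.shape)  # (160,) = (1 + 4 cells) * 4 centers * 8 dimensions
```

In a real pipeline the descriptors would come from a CNN applied to region proposals and the concatenated code would feed a classifier; both parts are outside the scope of this sketch.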
Publisher
Database: Elsevier - ScienceDirect
Journal: Signal Processing: Image Communication - Volume 54, May 2017, Pages 118-129