Article code: 528532
Journal code: 869581
Publication year: 2013
English article: 13-page PDF
Full-text version: free download
English title of the ISI article
Human activity recognition in videos using a single example
Related subjects
Engineering and Basic Sciences; Computer Engineering; Computer Vision and Pattern Recognition
Abstract


• A hierarchical structure for accurate video to video matching and event recognition
• Incorporating contextual information into the “bag of video words” framework
• Coding spatio-temporal compositions of video volumes by a probabilistic framework

This paper presents a novel approach for action recognition, localization and video matching based on a hierarchical codebook model of local spatio-temporal video volumes. Given a single example of an activity as a query video, the proposed method finds videos similar to the query in a target video dataset. The method is based on the bag of video words (BOV) representation and does not require prior knowledge about actions, background subtraction, motion estimation or tracking. It is also robust to spatial and temporal scale changes, as well as to some deformations. The hierarchical algorithm codes a video as a compact set of spatio-temporal volumes, while considering their spatio-temporal compositions in order to account for spatial and temporal contextual information. This hierarchy is achieved by first constructing a codebook of spatio-temporal video volumes. Then a large contextual volume containing many spatio-temporal volumes (an ensemble of volumes) is considered. These ensembles are used to construct a probabilistic model of video volumes and their spatio-temporal compositions. The algorithm was applied to three available video datasets for action recognition with different complexities (KTH, Weizmann, and MSR II), and the results were superior to those of other approaches, especially in the case of a single training example and in cross-dataset action recognition.
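The plain bag of video words step that the abstract builds on can be illustrated with a small sketch. This is not the authors' implementation: the raw-pixel descriptors, the simple k-means loop, the histogram-intersection score, and every function name and parameter below are assumptions chosen for illustration. The hierarchical ensemble model and its probabilistic treatment of spatio-temporal compositions are not reproduced here.

```python
import numpy as np


def extract_volumes(video, size=(4, 8, 8), step=(4, 8, 8)):
    """Cut a (T, H, W) grayscale video into dense spatio-temporal volumes.

    Each volume is flattened into a raw-pixel descriptor (an assumption;
    the paper's descriptor is not specified in the abstract).
    """
    T, H, W = video.shape
    st, sh, sw = size
    vols = []
    for t in range(0, T - st + 1, step[0]):
        for y in range(0, H - sh + 1, step[1]):
            for x in range(0, W - sw + 1, step[2]):
                vols.append(video[t:t + st, y:y + sh, x:x + sw].ravel())
    return np.asarray(vols, dtype=np.float64)


def assign(descriptors, centers):
    """Return the index of the nearest codeword for every descriptor."""
    d2 = ((descriptors[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)


def build_codebook(descriptors, k, iters=20, seed=0):
    """Plain k-means codebook (hypothetical stand-in for the paper's codebook)."""
    rng = np.random.default_rng(seed)
    picks = rng.choice(len(descriptors), size=k, replace=False)
    centers = descriptors[picks].copy()
    for _ in range(iters):
        labels = assign(descriptors, centers)
        for j in range(k):
            members = descriptors[labels == j]
            if len(members):  # keep the old center if a cluster is empty
                centers[j] = members.mean(axis=0)
    return centers


def bov_histogram(video, centers):
    """Normalized histogram of codeword occurrences for one video."""
    labels = assign(extract_volumes(video), centers)
    hist = np.bincount(labels, minlength=len(centers)).astype(np.float64)
    return hist / hist.sum()


def histogram_intersection(h1, h2):
    """Similarity in [0, 1] between two normalized histograms."""
    return float(np.minimum(h1, h2).sum())


# Usage with synthetic data: one query video, one target video.
rng = np.random.default_rng(0)
query = rng.random((16, 32, 32))
target = rng.random((16, 32, 32))
codebook = build_codebook(extract_volumes(query), k=8)
h_query = bov_histogram(query, codebook)
h_target = bov_histogram(target, codebook)
print(histogram_intersection(h_query, h_query))
print(histogram_intersection(h_query, h_target))
```

A histogram of this kind discards where and when each codeword occurs, which is the contextual information the abstract says the ensembles of volumes are meant to recover.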


Publisher
Database: Elsevier - ScienceDirect
Journal: Image and Vision Computing - Volume 31, Issue 11, November 2013, Pages 864–876
Authors
, ,