Article ID: 532463
Journal: Journal of Visual Communication and Image Representation
Published Year: 2014
Pages: 10
File Type: PDF
Abstract

- We represent the video with spatial and temporal sets of graphs.
- We extract frequent spatial and temporal sub-graphs from the spatial and temporal graph databases.
- The video is indexed with a combination of a histogram of the frequent spatial sub-graphs and a histogram of the frequent temporal sub-graphs.
- Our graph-based approach has shown its effectiveness for human action recognition.

Due to the exponential growth of video data uploaded to Internet websites, especially YouTube, effective analysis of video actions has become essential. In this paper, we tackle the challenging problem of human action recognition in realistic video sequences. The proposed system combines the efficiency of the Bag-of-Visual-Words strategy with the power of graphs for the structural representation of features. It is built upon the commonly used Space–Time Interest Point (STIP) local features, followed by a graph-based video representation that models the spatio-temporal relations among these features. Experiments are conducted on two challenging datasets: Hollywood2 and UCF YouTube Action. The experimental results show the effectiveness of the proposed method.
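The indexing scheme described in the highlights (a video descriptor built by concatenating a histogram of frequent spatial sub-graphs with a histogram of frequent temporal sub-graphs) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the sub-graph mining step is assumed to have already produced vocabularies of frequent sub-graph identifiers, and the function names and L1-normalization choice are hypothetical.

```python
from collections import Counter

def subgraph_histogram(subgraph_ids, vocabulary):
    """Count how often each frequent sub-graph (by id) occurs in one video,
    then L1-normalize so videos of different lengths are comparable.
    (Normalization is an assumption; the paper's abstract does not specify it.)"""
    counts = Counter(subgraph_ids)
    hist = [counts.get(g, 0) for g in vocabulary]
    total = sum(hist) or 1  # avoid division by zero for an empty video
    return [c / total for c in hist]

def video_descriptor(spatial_ids, temporal_ids, spatial_vocab, temporal_vocab):
    """Index a video as the concatenation of its spatial and temporal
    frequent-sub-graph histograms, as outlined in the highlights."""
    return (subgraph_histogram(spatial_ids, spatial_vocab)
            + subgraph_histogram(temporal_ids, temporal_vocab))

# Hypothetical example: sub-graph occurrences mined from one video.
desc = video_descriptor(["g1", "g1", "g2"], ["t1"],
                        spatial_vocab=["g1", "g2"],
                        temporal_vocab=["t1", "t2"])
```

The resulting fixed-length descriptor can then be fed to a standard classifier (e.g., an SVM), in line with the Bag-of-Visual-Words strategy mentioned in the abstract.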

Related Topics
Physical Sciences and Engineering › Computer Science › Computer Vision and Pattern Recognition