Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
532463 | Journal of Visual Communication and Image Representation | 2014 | 10 Pages |
• We represent the video with spatial and temporal sets of graphs.
• We extract frequent spatial and temporal sub-graphs from the spatial and temporal graph databases.
• The video is indexed with a combination of a histogram of the frequent spatial sub-graphs and a histogram of the frequent temporal sub-graphs.
• Our graph-based approach has shown its efficiency for human action recognition.
With the exponential growth of video data stored on and uploaded to websites, YouTube in particular, effective analysis of the actions in video has become necessary. In this paper, we tackle the challenging problem of human action recognition in realistic video sequences. The proposed system combines the efficiency of the Bag-of-visual-Words strategy with the power of graphs for the structural representation of features. It builds on the commonly used Space–Time Interest Points (STIP) local features, followed by a graph-based video representation that models the spatio-temporal relations among these features. Experiments are conducted on two challenging datasets, Hollywood2 and UCF YouTube Action, and the results show the effectiveness of the proposed method.
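
The indexing step named in the highlights (one histogram over frequent spatial sub-graphs combined with one over frequent temporal sub-graphs) might look roughly like the sketch below. It is only an illustration under stated assumptions: the abstract does not say how the two histograms are combined or normalised, so the concatenation and L1 normalisation here are guesses, and the function names and integer sub-graph IDs are hypothetical. STIP extraction, graph construction, and frequent sub-graph mining are assumed to have happened already and are not shown.

```python
from collections import Counter


def subgraph_histogram(matched_ids, vocabulary_size):
    """L1-normalised histogram of frequent sub-graph IDs found in one video.

    matched_ids: IDs (0 .. vocabulary_size - 1) of the frequent sub-graphs
    that occur in the video's graphs. Hypothetical input format.
    """
    # Ignore IDs outside the vocabulary so the histogram still sums to 1.
    counts = Counter(i for i in matched_ids if 0 <= i < vocabulary_size)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * vocabulary_size
    return [counts.get(i, 0) / total for i in range(vocabulary_size)]


def video_descriptor(spatial_ids, temporal_ids, n_spatial, n_temporal):
    """Combine the two histograms by concatenation (an assumption)."""
    spatial = subgraph_histogram(spatial_ids, n_spatial)
    temporal = subgraph_histogram(temporal_ids, n_temporal)
    return spatial + temporal


# Toy example: 3 frequent spatial sub-graphs, 2 frequent temporal ones.
descriptor = video_descriptor([0, 2, 2], [1], n_spatial=3, n_temporal=2)
print(descriptor)
```

A descriptor built this way has a fixed length regardless of how many interest points a video contains, which is what lets a standard classifier be trained on it, much as with a conventional Bag-of-visual-Words histogram.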