Article ID Journal Published Year Pages File Type
848879 Optik - International Journal for Light and Electron Optics 2014 5 Pages PDF
Abstract

Complex human activities in natural videos are often composed of several atomic-level actions organized hierarchically. Recognizing such high-level complex activities requires modeling not only the appearance variability of these action units but also the spatiotemporal relationships between them. In this paper, we focus on the problem of recognizing complex human activities in an example-based video retrieval framework and propose a new method based on hierarchical feature-graph matching. A video depicting an activity is represented as a high-level feature graph (HLFG), and each node of the HLFG is a mid-level feature graph (MLFG) constructed on a local collection of spatiotemporal interest points. The MLFG, the first level of our two-level graph structure, describes the local feature contents and spatiotemporal arrangements of interest points. The HLFG, the second level, describes the appearance variability and spatiotemporal arrangements of the atomic-level actions in an analogous way. Final recognition is accomplished by matching the HLFGs of the query and test videos, where matching two HLFGs in turn involves matching their constituent MLFGs. We use an efficient spectral method to solve both graph-matching problems. Our method does not require any preprocessing and gives reasonable results even with a small number of query examples. We evaluate our approach on a publicly available complex human activity dataset and achieve results comparable to other systems that have studied this problem.
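The abstract does not specify which spectral matching formulation is used, but spectral methods for graph matching commonly follow the pattern of building a pairwise assignment-affinity matrix, taking its principal eigenvector as soft assignment scores, and greedily discretizing them into a one-to-one correspondence. The sketch below illustrates that generic pattern on a toy point-matching problem; the affinity function, point sets, and function name are illustrative assumptions, not the paper's actual algorithm.

```python
import numpy as np

def spectral_match(M, n_src, n_tgt):
    # Generic spectral matching sketch (not the paper's exact method).
    # M is the (n_src*n_tgt) x (n_src*n_tgt) assignment-affinity matrix:
    # index a = i * n_tgt + ip encodes the candidate assignment
    # "source node i -> target node ip", and M[a, b] scores how
    # compatible two assignments are with each other.
    vals, vecs = np.linalg.eigh(M)
    x = np.abs(vecs[:, np.argmax(vals)])   # principal eigenvector
    match, used_src, used_tgt = {}, set(), set()
    for a in np.argsort(-x):               # greedy discretization
        i, ip = divmod(int(a), n_tgt)
        if i in used_src or ip in used_tgt:
            continue                       # enforce one-to-one mapping
        match[i] = ip
        used_src.add(i)
        used_tgt.add(ip)
    return match

# Toy example: match three collinear points to a permuted copy of
# themselves; assignments are compatible when they preserve distances.
src = np.array([[0.0, 0.0], [1.0, 0.0], [3.0, 0.0]])
tgt = src[[2, 0, 1]]                       # true mapping: 0->1, 1->2, 2->0

n = len(src)
M = np.zeros((n * n, n * n))
for i in range(n):
    for ip in range(n):
        for j in range(n):
            for jp in range(n):
                if i == j or ip == jp:     # conflicting assignments
                    continue
                d_src = np.linalg.norm(src[i] - src[j])
                d_tgt = np.linalg.norm(tgt[ip] - tgt[jp])
                M[i * n + ip, j * n + jp] = np.exp(-abs(d_src - d_tgt))

print(spectral_match(M, n, n))
```

In the paper's two-level setting, the same machinery would presumably be applied twice: once over interest points to score MLFG-to-MLFG matches, and once over atomic actions (HLFG nodes) using those MLFG match scores as node affinities.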

Related Topics
Physical Sciences and Engineering Engineering Engineering (General)