Article ID: 527751
Journal: Computer Vision and Image Understanding
Published Year: 2013
Pages: 16
File Type: PDF
Abstract

In this paper, a novel generalized framework for activity representation and recognition based on a ‘string of feature graphs (SFG)’ model is introduced. The proposed framework represents a visual activity as a string of feature graphs, where the string elements are first matched using a graph-based spectral technique, followed by a dynamic programming scheme for matching the complete strings. The framework is motivated by the success of time sequence analysis approaches in speech recognition, but is modified to capture the spatio-temporal properties of individual actions, the interactions between objects, and the speed of activity execution. The framework can be adapted to various spatio-temporal motion features, and we give details on using STIP features and track features. Furthermore, we show how the SFG model can be embedded within a switched dynamical system (SDS) that automatically chooses the most efficient features for a particular video segment. This allows us to analyze a variety of activities in natural videos in a computationally efficient manner. Experimental results for the basic SFG model, as well as for its integration with the SDS, are shown on some of the most challenging multi-object datasets available to the activity analysis community.
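
The string-level matching step can be illustrated with a minimal sketch. It assumes a standard dynamic time warping (DTW) recurrence, which is one common dynamic programming scheme for aligning sequences executed at different speeds; the abstract does not specify the exact recurrence used in the paper. The names `dtw_distance` and `jaccard_cost` are hypothetical, and the edge-set Jaccard cost is only a stand-in for the graph-based spectral matching score, which is not reproduced here.

```python
from typing import Callable, FrozenSet, Sequence, Tuple

# Assumption: each string element (a "feature graph") is reduced to a set of
# labeled edges. This is a simplification for illustration only.
Graph = FrozenSet[Tuple[str, str]]


def jaccard_cost(g: Graph, h: Graph) -> float:
    """Stand-in element cost: 1 - Jaccard similarity of the two edge sets.

    This is NOT the spectral graph matching named in the abstract.
    """
    union = g | h
    if not union:
        return 0.0
    return 1.0 - len(g & h) / len(union)


def dtw_distance(
    a: Sequence[Graph],
    b: Sequence[Graph],
    cost: Callable[[Graph, Graph], float],
) -> float:
    """Minimal cumulative cost of aligning string a with string b using DTW."""
    n, m = len(a), len(b)
    inf = float("inf")
    # dp[i][j] holds the best cost of aligning a[:i] with b[:j].
    dp = [[inf] * (m + 1) for _ in range(n + 1)]
    dp[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            c = cost(a[i - 1], b[j - 1])
            # An element may match one-to-one or be stretched over several
            # elements of the other string (slower or faster execution).
            dp[i][j] = c + min(dp[i - 1][j - 1], dp[i - 1][j], dp[i][j - 1])
    return dp[n][m]


# Three toy feature graphs and two strings describing the same activity,
# the second executed more slowly (elements repeated).
g1: Graph = frozenset({("a", "b")})
g2: Graph = frozenset({("a", "b"), ("b", "c")})
g3: Graph = frozenset({("c", "d")})

fast = [g1, g2, g3]
slow = [g1, g1, g2, g2, g3]
distance_same_activity = dtw_distance(fast, slow, jaccard_cost)
```

Because the recurrence lets one element align with several consecutive elements of the other string, the two toy strings above align at zero cost even though their lengths differ, which is the kind of speed tolerance the abstract describes.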

► We propose a generalized framework for activity representation and recognition based on a ‘string of feature graphs (SFG)’ model.
► We examine the performance of the SFG model based on STIP features and track features.
► We embed the SFG model within a switched dynamical system (SDS) to perform adaptive feature selection.
► The proposed adaptive feature selection scheme reduces the computational complexity while improving the recognition accuracy.

Related Topics
Physical Sciences and Engineering > Computer Science > Computer Vision and Pattern Recognition