Article ID Journal Published Year Pages File Type
536130 Pattern Recognition Letters 2009 7 Pages PDF
Abstract

The representation and recognition of complex semantic events (e.g. illegal parking, stealing objects) is a challenging task for high-level understanding of video sequence. To solve this problem, an attribute graph grammar for events modeling is studied in this paper. This grammar models the variability of semantic events by a set of meaningful “event components” with the spatio-temporal constraints. The event components are defined manually according to their semantic meaning, and further decomposed into atomic event primitives. These event primitives are learned on a object-trajectory table that describes mobile object attributes (location, velocity, and visibility) in a video sequence. A dictionary of temporal and spatial relations are defined to constrain the event primitives. With this representation, one observed event can be parsed into an “event parse graph”, and all possible variability of one event can be modeled into an “event And–Or graph”, in a syntactic way. The probability model of an “event And–Or graph” can be learned on a set of annotated event instances, and given a learned event And–Or graph, a Gibbs sampling scheme is utilized for inference on a testing video. In the experiments, we test events recognition performance of the proposed on both real indoor and outdoor videos and show quantitative recognition rate on the public LHI dataset.

Related Topics
Physical Sciences and Engineering Computer Science Computer Vision and Pattern Recognition
Authors
, , , ,