Article ID | Journal | Published Year | Pages |
---|---|---|---|
558453 | Digital Signal Processing | 2011 | 12 Pages |
This paper considers the intent inference problem in a video surveillance application involving person tracking. Surveillance is carried out through a distributed network of cameras with large non-overlapping areas. The target dynamics are formulated using a novel grammar-modulated state space model. We identify and model trajectory shapes of interest using stochastic context-free grammars. Inference with these grammar models is performed using a Bayesian estimation algorithm, the Earley–Stolcke parser. The intent inference procedure operates at a meta-level, which allows conventional trackers to be used at the base level. With a suitable fusion scheme, intent inference using stochastic context-free grammars in a distributed camera network is shown to perform adequately even in the presence of large observation gaps and noisy observations.
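To make the idea of scoring a trajectory shape with a stochastic context-free grammar more concrete, the following is a minimal sketch, not the paper's model. It assumes a hypothetical toy grammar in Chomsky normal form that describes an "east, then north" turn over quantized movement symbols `e` and `n`, with rule probabilities chosen purely for illustration. For brevity it computes the string probability with the standard inside (CYK-style) algorithm rather than the Earley–Stolcke parser named in the abstract.

```python
from collections import defaultdict

# Hypothetical toy grammar (an assumption for illustration, not from the paper).
# Binary rules A -> B C with their probabilities.
BINARY = {
    ("S", "E", "N"): 1.0,   # a turn: an eastward run followed by a northward run
    ("E", "X", "E"): 0.5,   # extend the eastward run by one step
    ("N", "Y", "N"): 0.5,   # extend the northward run by one step
}
# Lexical rules A -> terminal with their probabilities.
LEXICAL = {
    ("E", "e"): 0.5,
    ("X", "e"): 1.0,
    ("N", "n"): 0.5,
    ("Y", "n"): 1.0,
}


def sentence_probability(symbols, start="S"):
    """Return P(start =>* symbols) using the inside algorithm."""
    n = len(symbols)
    if n == 0:
        return 0.0
    # inside[(i, j, A)] = probability that A derives symbols[i:j]
    inside = defaultdict(float)
    # Spans of length 1 are filled from the lexical rules.
    for i, sym in enumerate(symbols):
        for (lhs, term), p in LEXICAL.items():
            if term == sym:
                inside[(i, i + 1, lhs)] += p
    # Longer spans combine two adjacent sub-spans through a binary rule.
    for span in range(2, n + 1):
        for i in range(0, n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for (lhs, left, right), p in BINARY.items():
                    inside[(i, j, lhs)] += (
                        p * inside[(i, k, left)] * inside[(k, j, right)]
                    )
    return inside[(0, n, start)]


if __name__ == "__main__":
    for track in ["en", "een", "ne"]:
        print(track, sentence_probability(track))
```

Under this toy grammar, symbol strings that match the turn shape receive non-zero probability, while a string such as `ne` (north before east) receives zero. In a setting like the one the abstract describes, the symbols would presumably come from a base-level tracker's quantized output, which is an assumption here rather than something the abstract specifies.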