Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
6941351 | Pattern Recognition Letters | 2014 | 13 Pages |
Abstract
This paper presents an investigation into event detection in crowded scenes, where the event of interest co-occurs with other activities and only binary labels at the clip level are available. The proposed approach incorporates a fast feature descriptor from the MPEG domain, and a novel multiple instance learning (MIL) algorithm using sparse approximation and random sensing. MPEG motion vectors are used to build particle trajectories that represent the motion of objects in uniform video clips, and the MPEG DCT coefficients are used to compute a foreground map to remove background particles. Trajectories are transformed into the Fourier domain, and the Fourier representations are quantized into visual words using the K-Means algorithm. The proposed MIL algorithm models the scene as a linear combination of independent events, where each event is a distribution of visual words. Experimental results show that the proposed approaches achieve promising results for event detection compared to the state-of-the-art.
Related Topics
Physical Sciences and Engineering
Computer Science
Computer Vision and Pattern Recognition
Authors
Jingxin Xu, Simon Denman, Vikas Reddy, Clinton Fookes, Sridha Sridharan,