Article ID: 407035
Journal: Neurocomputing
Published Year: 2014
Pages: 9
File Type: PDF
Abstract

• A local temporal self-similarity (LTSS) descriptor is proposed to learn action patterns directly from difference images.
• The presented framework does not require detecting or locating the human body in a given video sequence.
• The framework can be executed very fast while achieving satisfactory recognition performance.

A new framework is presented for single-person action recognition. The framework requires neither detection/location of human-body bounding boxes nor motion estimation in each frame. A novel descriptor for action representation is learned from local temporal self-similarities (LTSSs) derived directly from difference images. A bag-of-words framework then performs action classification using these descriptors. We investigated the effectiveness of the framework on two public human action datasets, the Weizmann dataset and the KTH dataset, achieving recognition rates of 95.6% and 91.1%, respectively. Both results are competitive with state-of-the-art approaches, while the framework has high potential for faster execution.
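The pipeline the abstract describes (frame differencing, then local temporal self-similarity descriptors, then bag-of-words classification) can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' exact method: here an LTSS descriptor is taken to be the pairwise cosine similarities between difference images inside a sliding temporal window; the window size and the normalization are choices made for the sketch.

```python
import numpy as np

def difference_images(frames):
    # Absolute frame-to-frame differences; frames: (T, H, W) array.
    return np.abs(np.diff(frames.astype(np.float64), axis=0))

def local_temporal_self_similarity(diff_imgs, window=5):
    # For each sliding temporal window, build a small self-similarity
    # matrix of cosine similarities between flattened difference images,
    # and return its upper triangle as one descriptor per window position.
    # (Illustrative definition; the paper's exact LTSS may differ.)
    T = diff_imgs.shape[0]
    flat = diff_imgs.reshape(T, -1)
    norms = np.linalg.norm(flat, axis=1, keepdims=True)
    flat = flat / np.maximum(norms, 1e-8)       # unit-normalize, avoid /0
    iu = np.triu_indices(window, k=1)
    descs = []
    for t in range(T - window + 1):
        block = flat[t:t + window]              # (window, H*W)
        ssm = block @ block.T                   # (window, window) similarities
        descs.append(ssm[iu])                   # upper triangle as descriptor
    return np.array(descs)

# Toy demo: 20 random 16x16 frames standing in for a video clip.
rng = np.random.default_rng(0)
frames = rng.random((20, 16, 16))
diffs = difference_images(frames)               # (19, 16, 16)
ltss = local_temporal_self_similarity(diffs)    # (15, 10)
print(diffs.shape, ltss.shape)
```

In a full bag-of-words pipeline, the per-window descriptors from many training clips would be clustered into a codebook (e.g. with k-means), each clip represented as a histogram of codeword assignments, and the histograms fed to a classifier.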

Keywords
Related Topics
Physical Sciences and Engineering › Computer Science › Artificial Intelligence
Authors