Article ID: 4977416
Journal: Signal Processing
Published Year: 2018
Pages: 13
File Type: PDF
Abstract
Low-cost depth cameras have facilitated research on human action recognition in recent decades. Although various approaches have been presented to improve recognition accuracy, they are rarely extended to the online recognition task in cluttered scenes. In this paper, we propose an effective approach for human action recognition using depth sequences that is insensitive to variations in temporal duration and suited to complex backgrounds. By embedding skeleton information into depth maps, the human body is partitioned into a set of motion parts, which takes the geometrical structure of the human body into account and aids recognition against complex backgrounds. A local spatio-temporal scaled pyramid is applied to obtain a compact local feature representation. A simplified Fisher vector encoding method is introduced to aggregate local coarse features into a discriminative representation with a unified form. The proposed approach is validated on three public benchmark datasets: MSR Daily Activity 3D, MSR Action Pairs, and MSR Action 3D. The experimental results demonstrate the effectiveness of the proposed approach and its feasibility for real-time applications.
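To make the idea of aggregating local features into a representation with a unified form more concrete, the following is a minimal sketch of a Fisher vector encoder over a diagonal-covariance Gaussian mixture model. It is not the paper's method: the abstract does not say how the encoding is simplified, so keeping only the gradients with respect to the component means is an assumption made here, as are the function name `fisher_vector_means` and the toy mixture parameters.

```python
import math


def fisher_vector_means(descriptors, weights, means, variances):
    """Encode a variable-length set of local descriptors as one fixed-length vector.

    Hypothetical helper: only gradients with respect to the GMM means are kept,
    which is an assumed simplification, not the variant used in the paper.
    """
    n = len(descriptors)
    if n == 0:
        raise ValueError("at least one descriptor is required")
    k = len(weights)
    d = len(means[0])
    grad = [[0.0] * d for _ in range(k)]

    for x in descriptors:
        # Log-density of x under each weighted diagonal Gaussian component.
        log_p = []
        for j in range(k):
            s = math.log(weights[j])
            for i in range(d):
                s -= 0.5 * math.log(2.0 * math.pi * variances[j][i])
                s -= 0.5 * (x[i] - means[j][i]) ** 2 / variances[j][i]
            log_p.append(s)

        # Soft assignment of x to the components (numerically stable softmax).
        top = max(log_p)
        exp_p = [math.exp(v - top) for v in log_p]
        total = sum(exp_p)
        gamma = [v / total for v in exp_p]

        # Accumulate the whitened deviation from each component mean.
        for j in range(k):
            for i in range(d):
                grad[j][i] += gamma[j] * (x[i] - means[j][i]) / math.sqrt(variances[j][i])

    # Scale by descriptor count and component weight, then flatten to length k * d.
    flat = [grad[j][i] / (n * math.sqrt(weights[j])) for j in range(k) for i in range(d)]

    # L2-normalise so that sequences of different lengths are comparable.
    norm = math.sqrt(sum(v * v for v in flat))
    return [v / norm for v in flat] if norm > 0 else flat


# Toy mixture (assumed values): 2 components over 2-dimensional descriptors.
w = [0.5, 0.5]
mu = [[0.0, 0.0], [1.0, 1.0]]
var = [[1.0, 1.0], [1.0, 1.0]]

short_clip = [[0.1, 0.2], [0.9, 1.1]]
long_clip = [[0.1, 0.2], [0.9, 1.1], [0.5, 0.4], [1.2, 0.8], [0.0, 0.3]]

fv_short = fisher_vector_means(short_clip, w, mu, var)
fv_long = fisher_vector_means(long_clip, w, mu, var)
```

Both clips map to a vector of length 4 (number of components times descriptor dimension) even though they contain different numbers of descriptors, which is the property that makes this kind of encoding insensitive to temporal duration. In practice the mixture parameters would be fitted on training descriptors rather than set by hand.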