Article ID: 8960221
Journal: Pattern Recognition
Published Year: 2019
Pages: 38
File Type: PDF
Abstract
In this paper, we propose an unsupervised representation method that learns a principal orientations and residual descriptor (PORD) for action recognition. PORD aims to learn the statistical principal orientations of a dataset and to represent the local features of action videos by their residual values. Existing methods based on hand-crafted features require substantial prior knowledge and lack the ability to represent the distribution of features in the dataset. Most methods based on deep learned features are data adaptive, but they consider neither the projection orientations of the features nor the quantization loss of locally aggregated descriptors. Our descriptor is motivated by two observations: the principal orientations reflect the distribution of local features in the dataset, and the residual of the projection contains discriminative information about the local features. Moreover, we propose a multi-modality PORD method that reduces the modality gap between the RGB channels and the depth channel at the feature level, making our method applicable to RGB-D action recognition. To evaluate performance, we conduct experiments on five challenging action datasets: Hollywood2, UCF101, HMDB51, MSRDaily, and MSR-Pair. The results show that our method is competitive with state-of-the-art methods.
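The abstract does not spell out how the orientations are learned, so the following is only a rough illustration of the two ideas it names: orientations that follow the distribution of local features, and the residual left after projecting onto them. It substitutes plain PCA for the learning step, which is an assumption and not the authors' PORD algorithm. The function names `learn_orientations` and `projection_residual`, the random feature matrix, and the choice `k=4` are all hypothetical.

```python
import numpy as np


def learn_orientations(features, k):
    """Return the feature mean and the top-k PCA directions.

    PCA stands in here for the paper's orientation learning (assumption).
    `features` has one local feature per row.
    """
    mean = features.mean(axis=0)
    # Rows of vt are orthonormal directions sorted by decreasing variance.
    _, _, vt = np.linalg.svd(features - mean, full_matrices=False)
    return mean, vt[:k]


def projection_residual(features, mean, orientations):
    """Return what remains of each centered feature after projection."""
    centered = features - mean
    coeffs = centered @ orientations.T      # coordinates along each orientation
    reconstruction = coeffs @ orientations  # part explained by the orientations
    return centered - reconstruction


# Synthetic stand-in for local video features: 200 features of dimension 16.
rng = np.random.default_rng(0)
features = rng.normal(size=(200, 16))

mean, orientations = learn_orientations(features, k=4)
residual = projection_residual(features, mean, orientations)
```

In this sketch each residual row is orthogonal to every learned orientation, so it carries only the information that the orientations do not explain. How PORD actually aggregates such residuals into a video-level descriptor is not described in the abstract and is not attempted here.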
Related Topics
Physical Sciences and Engineering > Computer Science > Computer Vision and Pattern Recognition