Article ID | Journal | Published Year | Pages
---|---|---|---
4969266 | Journal of Visual Communication and Image Representation | 2017 | 11 Pages
Abstract
Multi-modal action recognition has become a hot research topic, and this paper proposes a collaborative sparse representation learning model for RGB-D action recognition in which RGB and depth information are adaptively fused. Specifically, dense trajectory features are first extracted and a Bag-of-Words (BoW) weighting scheme is employed for the RGB modality; for the depth modality, the human pose model (HPM) and temporal modeling (TM) representations are utilized. Meanwhile, the collaborative reconstruction structure and the corresponding objective functions for the multiple modalities are designed, and the proposed model is then optimized collaboratively to discover the latent complementary information between RGB and depth data. Finally, the collaborative reconstruction error is employed as the classification scheme. Large-scale experiments on the challenging public DHA, M2I and Northwestern-UCLA action datasets show that our model performs much better with both modalities than with either modality alone, boosting human action recognition by taking advantage of the complementary characteristics of RGB and depth data.
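The classification scheme described in the abstract, picking the class whose dictionary atoms yield the smallest combined reconstruction error across modalities, can be sketched as follows. This is a minimal illustration in the general spirit of sparse-representation classification, not the authors' exact collaborative model: the ISTA solver, the per-modality dictionaries, and the simple sum of the two modalities' errors are all illustrative assumptions.

```python
# Hypothetical sketch: reconstruction-error classification over sparse codes
# for two modalities (RGB and depth). Dictionary layout, solver, and the
# unweighted sum of errors are illustrative assumptions, not the paper's model.
import numpy as np

def sparse_code(D, x, lam=0.1, iters=100):
    """ISTA for min_a 0.5*||x - D a||^2 + lam*||a||_1 (illustrative solver)."""
    L = np.linalg.norm(D, 2) ** 2        # Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(iters):
        g = a - (D.T @ (D @ a - x)) / L  # gradient step on the data term
        a = np.sign(g) * np.maximum(np.abs(g) - lam / L, 0.0)  # soft threshold
    return a

def classify(D_rgb, D_depth, labels, x_rgb, x_depth, lam=0.1):
    """Code each modality over its dictionary (columns share one label list),
    then return the class whose atoms give the smallest combined error."""
    a_rgb = sparse_code(D_rgb, x_rgb, lam)
    a_depth = sparse_code(D_depth, x_depth, lam)
    best, best_err = None, np.inf
    for c in set(labels):
        mask = np.array([l == c for l in labels])
        err = (np.linalg.norm(x_rgb - D_rgb[:, mask] @ a_rgb[mask]) ** 2
               + np.linalg.norm(x_depth - D_depth[:, mask] @ a_depth[mask]) ** 2)
        if err < best_err:
            best, best_err = c, err
    return best
```

In this toy form the two modalities are coded independently; the paper's contribution is precisely to couple the two reconstructions through shared objective functions so that the complementary information is exploited during optimization, not only at the final error comparison.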
Related Topics
Physical Sciences and Engineering › Computer Science › Computer Vision and Pattern Recognition
Authors
Z. Gao, S.H. Li, Y.J. Zhu, C. Wang, H. Zhang