Article ID: 4969266
Journal: Journal of Visual Communication and Image Representation
Published Year: 2017
Pages: 11 Pages
File Type: PDF
Abstract
Multi-modality action recognition has become an active research topic. This paper proposes a collaborative sparse representation learning model for RGB-D action recognition in which RGB and depth information are adaptively fused. Specifically, for the RGB modality, dense trajectory features are first extracted and a Bag-of-Words (BoW) weighting scheme is employed; for the depth modality, the human pose representation model (HPM) and temporal modeling (TM) representation are utilized. A collaborative reconstruction structure and the corresponding objective functions for the multiple modalities are then designed, and the proposed model is collaboratively optimized to discover the latent complementary information between RGB and depth data. Finally, the collaborative reconstruction error is employed as the classification scheme. Large-scale experiments on the challenging public DHA, M2I and Northwestern-UCLA action datasets show that the model using both modalities performs much better than traditional single-modality approaches, boosting human action recognition by taking advantage of the complementary characteristics of the RGB and depth modalities.
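The idea of classifying by reconstruction error across two modalities can be illustrated with a minimal sketch. This is not the paper's model: the abstract does not give the objective functions, so the sketch substitutes a plain ridge-regularized (collaborative representation) coding step per modality and fuses the class-wise residuals with a fixed weight. The function names, the weight `w`, the penalty `lam`, and the toy dictionaries are all assumptions made for illustration.

```python
import numpy as np

def class_residuals(D, labels, y, lam=0.1):
    """Code y over dictionary D (features x atoms) with a ridge penalty,
    then return the reconstruction error obtained from each class's atoms."""
    # Closed-form ridge solution: x = (D^T D + lam * I)^-1 D^T y
    x = np.linalg.solve(D.T @ D + lam * np.eye(D.shape[1]), D.T @ y)
    classes = np.unique(labels)
    res = np.array([
        np.linalg.norm(y - D[:, labels == c] @ x[labels == c])
        for c in classes
    ])
    return classes, res

def classify_rgbd(D_rgb, D_depth, labels, y_rgb, y_depth, w=0.5, lam=0.1):
    """Hypothetical fusion rule: weighted sum of per-class residuals
    from the two modalities; the class with the smallest sum wins."""
    classes, r_rgb = class_residuals(D_rgb, labels, y_rgb, lam)
    _, r_depth = class_residuals(D_depth, labels, y_depth, lam)
    fused = w * r_rgb + (1.0 - w) * r_depth
    return classes[np.argmin(fused)]

# Toy data (assumed): two classes, two training atoms per class and modality.
labels = np.array([0, 0, 1, 1])
D_rgb = np.array([[1.0, 0.9, 0.0, 0.1],
                  [0.0, 0.1, 1.0, 0.9],
                  [0.0, 0.0, 0.0, 0.0]])
D_depth = np.array([[1.0, 0.8, 0.0, 0.2],
                    [0.0, 0.2, 1.0, 0.8]])

pred_a = classify_rgbd(D_rgb, D_depth, labels,
                       np.array([1.0, 0.05, 0.0]), np.array([0.9, 0.1]))
pred_b = classify_rgbd(D_rgb, D_depth, labels,
                       np.array([0.05, 1.0, 0.0]), np.array([0.1, 0.9]))
```

In this toy setup the first test sample lies close to the class-0 atoms in both modalities and the second close to the class-1 atoms, so the fused residual is smallest for the matching class. The model described in the abstract learns the fusion adaptively and optimizes both modalities jointly, which a fixed `w` does not capture.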
Related Topics
Physical Sciences and Engineering; Computer Science; Computer Vision and Pattern Recognition