Collaborative sparse representation learning model for RGBD action recognition

Article ID	Journal	Published Year	Pages	File Type
4969266	Journal of Visual Communication and Image Representation	2017	11 Pages	PDF

Abstract

Multi-modality action recognition has become an active research topic. This paper proposes a collaborative sparse representation learning model for RGB-D action recognition in which RGB and depth information are adaptively fused. Specifically, for the RGB modality, dense trajectory features are first extracted and a Bag-of-Words (BoW) weighting scheme is employed; for the depth modality, the human pose representation model (HPM) and temporal modeling (TM) representation are utilized. A collaborative reconstruction structure and the corresponding objective functions for the multiple modalities are then designed, and the proposed model is collaboratively optimized to discover the latent complementary information between RGB and depth data. Finally, the collaborative reconstruction error is employed as the classification scheme. Large-scale experiments on the challenging public DHA, M2I and Northwestern-UCLA action datasets show that the model using both modalities performs much better than traditional single-modality approaches, and that it can boost human action recognition performance by taking advantage of the complementary characteristics of the RGB and depth modalities.
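The idea of classifying by reconstruction error over two modalities can be illustrated with a minimal sketch. This is not the model from the paper: the abstract does not give the objective functions, so the sketch below codes each modality independently with a ridge-regularized (L2) least-squares solution instead of a jointly optimized sparse code, and fuses the per-class residuals with a fixed weight rather than adaptively. The function names `class_residuals` and `classify_rgbd`, and the parameters `lam` and `w`, are hypothetical and chosen only for illustration.

```python
import numpy as np

def class_residuals(D, labels, x, lam=0.1):
    """Code x over dictionary D (one training sample per column) and
    return the reconstruction residual obtained from each class alone.

    Assumption: ridge-regularized coding stands in for sparse coding here.
    """
    # Closed-form ridge solution: a = (D^T D + lam * I)^{-1} D^T x
    a = np.linalg.solve(D.T @ D + lam * np.eye(D.shape[1]), D.T @ x)
    classes = np.unique(labels)
    # Reconstruct x using only the atoms (and coefficients) of one class
    res = np.array([
        np.linalg.norm(x - D[:, labels == c] @ a[labels == c])
        for c in classes
    ])
    return classes, res

def classify_rgbd(D_rgb, D_depth, labels, x_rgb, x_depth, lam=0.1, w=0.5):
    """Predict the class with the smallest fused reconstruction error.

    Assumption: a fixed weight w combines the two modalities; the paper
    describes adaptive fusion, which this sketch does not implement.
    """
    classes, r_rgb = class_residuals(D_rgb, labels, x_rgb, lam)
    _, r_depth = class_residuals(D_depth, labels, x_depth, lam)
    fused = w * r_rgb + (1.0 - w) * r_depth
    return classes[np.argmin(fused)]
```

In such a setup, the columns of `D_rgb` would hold BoW histograms of dense trajectory features and the columns of `D_depth` would hold the depth descriptors, with `labels` giving the action class of each training column. A test sample is assigned to the class whose training samples reconstruct it best across both modalities.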

Keywords

Dense trajectory; Multi-modality