Article ID: 533253
Journal: Pattern Recognition
Published Year: 2015
Pages: 12
File Type: PDF
Highlights

• Multimodal learning for facial expression recognition (FER) is proposed.
• The first attempt to do FER from the joint representation of texture and landmarks.
• The multimodal structure combines feature extraction and classification together.
• Structured regularization is used to enforce the sparsity of different modalities.

Abstract

In this paper, multimodal learning for facial expression recognition (FER) is proposed. The method makes the first attempt to learn a joint representation from two complementary modalities of facial images: texture and landmarks. To learn the representation of each modality as well as the correlations and interactions between modalities, structured regularization (SR) is employed to enforce modality-specific sparsity and density. By introducing SR, the full range of facial expression cues is taken into account, so the method can handle subtle expressions and remains robust to varying facial-image inputs. With the proposed multimodal learning network, the joint representation learned from multimodal inputs is better suited to FER. Experimental results on the CK+ and NVIE databases demonstrate the superiority of the proposed method.
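To illustrate the structured-regularization idea the abstract describes, a common formulation is a group-sparsity (group-lasso) penalty that treats each modality's weights as one block: the penalty drives whole modality blocks toward zero while leaving weights within an active block dense. This is a minimal sketch, not the paper's exact formulation; the block boundaries, weights, and `lam` value below are hypothetical.

```python
import numpy as np

def group_sparsity_penalty(weights, groups, lam=0.1):
    """Group-lasso regularizer: lam times the sum of the L2 norms of each
    modality's weight block. Blocks with an L2 norm of zero contribute
    nothing, so whole modalities can be switched off (modality-level
    sparsity) while surviving blocks stay dense."""
    return lam * sum(np.linalg.norm(weights[g]) for g in groups)

# Toy joint weight vector: first 4 entries for the texture modality,
# last 3 for the landmark modality (illustrative sizes only).
w = np.array([0.5, -0.2, 0.1, 0.3, 0.0, 0.0, 0.0])
groups = [slice(0, 4), slice(4, 7)]

# The landmark block is all zeros, so only the texture block is penalized.
penalty = group_sparsity_penalty(w, groups, lam=0.1)
print(penalty)
```

In training, such a penalty would be added to the classification loss, so minimizing the total objective trades expression-recognition accuracy against modality-level sparsity.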

Related Topics
Physical Sciences and Engineering › Computer Science › Computer Vision and Pattern Recognition