Towards a dynamic expression recognition system under facial occlusion

Article ID	Journal	Published Year	Pages	File Type
534090	Pattern Recognition Letters	2012	11 Pages	PDF

Abstract

Facial occlusion is a challenging research topic in facial expression recognition (FER). It has created a need for new facial representations and occlusion detection methods in order to extend FER to uncontrolled environments. Most previous work addresses these two issues separately, and on static images. We are thus motivated to propose a complete system for video sequences consisting of facial representation, occlusion detection, and multiple feature fusion. To achieve a robust facial representation, we derive six feature vectors from the eye, nose and mouth components. These features, which carry temporal cues, are generated by dynamic texture and structural shape feature descriptors. Occlusion detection, on the other hand, is still mainly realized by traditional classifiers or model comparison. Sparse representation has recently been proposed as an efficient method against occlusion, but in FER it is correlated with facial identity unless an appropriate facial representation is used. We therefore present an evaluation demonstrating that the proposed facial representation is independent of facial identity. Inspired by Mercier et al. (2007), we then apply sparse representation and residual statistics to occlusion detection in image sequences. Because concatenating the six feature vectors into one causes the curse of dimensionality, we propose multiple feature fusion consisting of a fusion module and weight learning. Experimental results on the Extended Cohn–Kanade database and its simulated database demonstrate that our framework outperforms state-of-the-art FER methods on normal videos and, especially, on partially occluded videos.
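
The occlusion detection idea mentioned in the abstract (sparse representation plus residual statistics) can be illustrated with a small sketch. This is not the authors' implementation: the dictionary `D`, the toy feature vectors, the single residual statistic (relative residual norm) and the `THRESHOLD` value are all assumptions made for illustration, and orthogonal matching pursuit is used here simply as one common way to obtain a sparse code.

```python
import numpy as np

def omp(D, y, n_nonzero):
    """Orthogonal matching pursuit: greedily approximate y with few columns of D."""
    coef = np.zeros(D.shape[1])
    residual = y.astype(float)
    support = []
    for _ in range(n_nonzero):
        # Pick the atom most correlated with the current residual.
        idx = int(np.argmax(np.abs(D.T @ residual)))
        if idx in support:
            break  # no new atom is selected; stop early
        support.append(idx)
        # Refit all selected atoms jointly by least squares.
        sol, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        coef[:] = 0.0
        coef[support] = sol
        residual = y - D @ coef
    return coef, residual

def occlusion_score(D, y, n_nonzero=5):
    """Relative residual norm of the sparse reconstruction (larger = worse fit)."""
    _, residual = omp(D, y, n_nonzero)
    return float(np.linalg.norm(residual) / (np.linalg.norm(y) + 1e-12))

# Toy data (hypothetical): 20 training feature vectors of dimension 50 as columns.
rng = np.random.default_rng(0)
D, _ = np.linalg.qr(rng.standard_normal((50, 20)))  # orthonormal columns
clean = 1.0 * D[:, 2] - 0.5 * D[:, 7]               # explained by two atoms
occluded = clean.copy()
occluded[:25] = rng.standard_normal(25)             # corrupt half of the entries

THRESHOLD = 0.2  # hypothetical cut-off; in practice it would be tuned on data
for name, y in [("clean", clean), ("occluded", occluded)]:
    score = occlusion_score(D, y)
    print(name, round(score, 3), "occluded" if score > THRESHOLD else "visible")
```

A component whose feature vector is reconstructed poorly from unoccluded training samples leaves a large residual, which is the signal the sketch thresholds.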

► We propose a framework for facial expression recognition under facial occlusion in image sequences.
► Component-based facial representations and occlusion detection are robust to partial occlusion.
► Multiple feature fusion with weight learning boosts the performance of facial expression recognition.
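
As a rough illustration of how multiple feature fusion with learned weights might combine per-component decisions, the sketch below fuses class scores from six feature-specific classifiers by a weighted sum and drops components flagged as occluded. The abstract does not specify the fusion rule or the weight-learning procedure, so the accuracy-proportional weights, the function names `learn_weights` and `fuse`, and the toy scores are all assumptions.

```python
import numpy as np

def learn_weights(val_accuracy):
    """Turn per-feature validation accuracies into weights that sum to 1."""
    acc = np.asarray(val_accuracy, dtype=float)
    return acc / acc.sum()

def fuse(scores, weights, occluded=None):
    """Weighted sum of per-feature class scores, skipping occluded features."""
    scores = np.asarray(scores, dtype=float)   # shape: (n_features, n_classes)
    w = np.asarray(weights, dtype=float).copy()
    if occluded is not None:
        w[np.asarray(occluded, dtype=bool)] = 0.0
    if w.sum() == 0:
        raise ValueError("all features are flagged as occluded")
    w /= w.sum()                               # renormalise remaining weights
    return w @ scores                          # shape: (n_classes,)

# Toy scores (hypothetical) from six feature-specific classifiers, two classes.
scores = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3], [0.6, 0.4], [0.1, 0.9], [0.2, 0.8]]
weights = learn_weights([0.8] * 6)             # equal accuracies give equal weights

fused_all = fuse(scores, weights)
fused_masked = fuse(scores, weights, occluded=[1, 1, 1, 1, 0, 0])
print(int(np.argmax(fused_all)), int(np.argmax(fused_masked)))
```

Fusing at the score level keeps each classifier working in the dimensionality of its own feature vector, which avoids the concatenated vector the abstract warns about.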

Keywords

Multiple feature fusion; Image sequences; Component