Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
6941153 | Pattern Recognition Letters | 2015 | 9 Pages |
Abstract
In this paper, we focus on the problem of 3D dynamic (4D) facial expression recognition. While traditional methods rely on building deformation models on high-resolution 3D meshes, our approach works directly on low-resolution RGB-D sequences, which allows us to apply our algorithm to videos captured by widespread, standard low-resolution RGB-D sensors such as the Kinect. After preprocessing both the RGB and depth image sequences, sparse features are learned from spatio-temporal local cuboids. A Conditional Random Fields classifier is then employed for training and classification. The proposed system is fully automatic and achieves superior results on three low-resolution datasets built from the 4D facial expression recognition dataset BU-4DFE. Extensive evaluations of our approach and comparisons with state-of-the-art methods are presented.
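To make the notion of spatio-temporal local cuboids concrete, the following is a minimal sketch in plain Python of how small blocks spanning several frames could be cut from a preprocessed sequence and flattened into vectors, which is the kind of input a sparse feature learner would consume. The abstract does not state the cuboid dimensions or the sampling strategy, so the function name `extract_cuboids`, the dense sliding-window sampling, and the default `size` and `stride` values are all assumptions made for illustration, not the authors' settings.

```python
from typing import List, Sequence, Tuple

# A grayscale or depth sequence indexed as video[t][y][x].
Video = Sequence[Sequence[Sequence[float]]]


def extract_cuboids(
    video: Video,
    size: Tuple[int, int, int] = (4, 8, 8),    # hypothetical (frames, height, width)
    stride: Tuple[int, int, int] = (2, 4, 4),  # hypothetical step along each axis
) -> List[List[float]]:
    """Slide a (frames, height, width) window over the sequence and
    return each local cuboid flattened into one feature vector."""
    n_t, n_y, n_x = len(video), len(video[0]), len(video[0][0])
    c_t, c_y, c_x = size
    s_t, s_y, s_x = stride
    cuboids = []
    for t0 in range(0, n_t - c_t + 1, s_t):
        for y0 in range(0, n_y - c_y + 1, s_y):
            for x0 in range(0, n_x - c_x + 1, s_x):
                # Flatten in (t, y, x) order so every vector has the same layout.
                cuboids.append([
                    video[t][y][x]
                    for t in range(t0, t0 + c_t)
                    for y in range(y0, y0 + c_y)
                    for x in range(x0, x0 + c_x)
                ])
    return cuboids


# Toy sequence: 6 frames of 16x16 pixels, each pixel set to its frame index.
toy = [[[float(t)] * 16 for _ in range(16)] for t in range(6)]
patches = extract_cuboids(toy)
print(len(patches), len(patches[0]))  # prints: 18 256
```

With these toy settings the window fits at 2 temporal and 3 x 3 spatial positions, giving 18 cuboids of 4 x 8 x 8 = 256 values each. In a full pipeline, one such set of vectors per modality (RGB and depth) would presumably be passed to a sparse coding step before classification.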
Related Topics
Physical Sciences and Engineering
Computer Science
Computer Vision and Pattern Recognition
Authors
Jie Shao, Ilaria Gori, Shaohua Wan, J.K. Aggarwal