Article ID: 525699
Journal: Computer Vision and Image Understanding
Published Year: 2015
Pages: 18
File Type: PDF
Abstract

• A hierarchical temporal model is used to estimate head pose in real-world videos.
• Head pose classification in (un)constrained databases shows superior performance.
• The proposed model is used to classify facial traits in real-world videos.
• Trait classification is performed with and without the estimated pose angle.
• Facial trait classification using the proposed model shows superior performance.

Recently, head pose estimation in real-world environments has been receiving attention in the computer vision community due to its applicability to a wide range of contexts. However, this task remains an open problem because of the challenges presented by real-world environments. Most approaches to this problem focus on estimation from single images or video frames, without leveraging the temporal information available in the entire video sequence. Other approaches frame the problem as classification into a set of very coarse pose bins. In this paper, we propose a hierarchical graphical model that probabilistically estimates continuous head pose angles from real-world videos by leveraging the temporal pose information over frames. The proposed graphical model is a general framework that can use any type of feature and can be adapted to any facial classification task. Furthermore, the framework outputs the entire pose distribution for a given video frame. This permits robust temporal probabilistic fusion of pose information over the video sequence, and also allows head pose information to be probabilistically embedded into other inference tasks. Experiments on large, real-world video sequences reveal that our approach significantly outperforms alternative state-of-the-art pose estimation methods. The proposed framework is also evaluated on gender and facial hair estimation. By incorporating pose information into the proposed hierarchical temporal graphical model, superior results are achieved for attribute classification tasks.
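
The abstract does not spell out the model's structure, so the following is only a minimal sketch of what temporal probabilistic fusion of per-frame pose distributions could look like. It is not the paper's hierarchical model. It assumes a single yaw angle discretized into bins, a Gaussian random-walk transition between consecutive frames, and a standard forward (filtering) recursion. The function names `gaussian_transition`, `fuse_pose_over_time`, and `expected_angle`, and all parameter values, are hypothetical and chosen for illustration.

```python
import math


def gaussian_transition(n_bins, bin_width_deg, sigma_deg):
    """Row-stochastic matrix T[i][j]: probability of moving from pose bin i
    to pose bin j between consecutive frames (assumed Gaussian random walk)."""
    rows = []
    for i in range(n_bins):
        weights = [
            math.exp(-0.5 * ((j - i) * bin_width_deg / sigma_deg) ** 2)
            for j in range(n_bins)
        ]
        total = sum(weights)
        rows.append([w / total for w in weights])
    return rows


def fuse_pose_over_time(frame_likelihoods, transition):
    """Forward filtering over pose bins.

    frame_likelihoods: one list per frame, giving an (unnormalized) per-bin
    pose likelihood from any single-frame estimator.
    Returns the filtered pose distribution for every frame."""
    n = len(transition)
    belief = [1.0 / n] * n  # uniform prior before the first frame
    posteriors = []
    for likelihood in frame_likelihoods:
        # Predict: propagate the previous belief through the motion model.
        predicted = [
            sum(belief[i] * transition[i][j] for i in range(n)) for j in range(n)
        ]
        # Update: weight the prediction by the current frame's evidence.
        unnormalized = [p * l for p, l in zip(predicted, likelihood)]
        z = sum(unnormalized)
        # If a frame carries no evidence at all, keep the prediction.
        belief = predicted if z == 0.0 else [u / z for u in unnormalized]
        posteriors.append(belief)
    return posteriors


def expected_angle(posterior, bin_centers_deg):
    """Continuous angle estimate as the mean of the pose distribution."""
    return sum(p * c for p, c in zip(posterior, bin_centers_deg))


# Usage: 13 yaw bins from -90 to +90 degrees in 15-degree steps (assumed layout).
centers = [-90 + 15 * k for k in range(13)]
T = gaussian_transition(n_bins=13, bin_width_deg=15.0, sigma_deg=10.0)
frame = [0.01] * 13
frame[8] = 1.0  # single-frame evidence peaked at +30 degrees
posteriors = fuse_pose_over_time([frame, frame, frame], T)
print(round(expected_angle(posteriors[-1], centers), 1))
```

Because each frame contributes a full distribution rather than a single label, uncertain frames influence the fused estimate less than confident ones, and the fused distribution could in principle be passed on to a downstream attribute classifier.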

Related Topics
Physical Sciences and Engineering > Computer Science > Computer Vision and Pattern Recognition