Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
526206 | Computer Vision and Image Understanding | 2011 | 14 Pages |
In this paper, we address the recovery of human 2D postures from monocular image sequences. We propose a novel pose estimation framework based on the integration of probabilistic bottom-up and top-down processes that iteratively refine each other: foreground pixels are segmented using image cues, whereas the fitting of a hierarchical 2D body model constrains the body partitions. Its main advantages are twofold. First, the presented framework is activity-independent, since it does not rely on learning any motion model. Second, we propose a confidence score indicating the quality of each estimated pose. Our study also reveals a significant discrepancy between ground truth joint positions depending on whether they are defined by humans or by a motion capture system. Quantitative and qualitative results are presented on a variety of video sequences to validate our approach.
Research highlights
► Combined bottom-up/top-down processes allow activity-independent pose estimation.
► Probabilistic Gaussian modelling produces a confidence score for each pose estimate.
► Comparison between human-defined and MoCap-based ground truth reveals a large discrepancy.
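The abstract does not say how the discrepancy between human-defined and MoCap-based ground truth is measured. One common way to quantify such a gap is the mean Euclidean distance between corresponding 2D joint positions, and the sketch below illustrates that measure only. The function name `mean_joint_discrepancy` and all coordinates are hypothetical, chosen for illustration, and are not taken from the paper.

```python
import math


def mean_joint_discrepancy(joints_a, joints_b):
    """Mean Euclidean distance (in pixels) between two annotations of the same 2D joints.

    Assumption: both inputs list the same joints in the same order as (x, y) pairs.
    This is an illustrative metric, not necessarily the one used in the paper.
    """
    if not joints_a or len(joints_a) != len(joints_b):
        raise ValueError("annotations must be non-empty and cover the same joints")
    total = 0.0
    for (xa, ya), (xb, yb) in zip(joints_a, joints_b):
        total += math.hypot(xa - xb, ya - yb)
    return total / len(joints_a)


# Hypothetical annotations of three joints in one frame (made-up values).
manual = [(100.0, 50.0), (120.0, 90.0), (110.0, 140.0)]
mocap = [(103.0, 54.0), (120.0, 90.0), (110.0, 150.0)]

discrepancy = mean_joint_discrepancy(manual, mocap)
print(f"mean joint discrepancy: {discrepancy:.1f} px")
```

Averaging over joints gives a single number per frame. Reporting the per-joint distances separately would show whether the disagreement is concentrated in particular joints.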