Article ID | Journal ID | Year | English article | Full-text version
526713 | 869205 | 2016 | 11 pages, PDF | Free download
English title (ISI article)
Fully automatic person segmentation in unconstrained video using spatio-temporal conditional random fields
Title (Persian translation, rendered in English)
Fully automatic person segmentation in unconstrained video using spatio-temporal conditional random fields
Keywords
Person segmentation, video segmentation, conditional random fields, optical flow, fully automatic
Related subjects
Engineering and Basic Sciences · Computer Engineering · Computer Vision and Pattern Recognition
English abstract


• Propose a method for automatic segmentation of people
• No constraints on camera or person in videos
• Learn model parameters without labeled segmented data
• Contribute labeled segmented video data for future research
• Proposed method performs favorably compared to other methods

The segmentation of objects, and of people in particular, is an important problem in computer vision. In this paper, we focus on automatically segmenting a person from challenging video sequences in which we place no constraint on camera viewpoint, camera motion, or the movements of a person in the scene. Our approach uses the most confident predictions from a pose detector as a form of anchor or keyframe stick-figure prediction, which helps guide the segmentation of other, more challenging frames in the video. Since even state-of-the-art pose detectors are unreliable on many frames, especially given that we are interested in segmentations with no camera or motion constraints, only the poses or stick-figure predictions for frames with the highest confidence in a localized temporal region anchor further processing. The stick-figure predictions within confident keyframes are used to extract color, position, and optical flow features. Multiple conditional random fields (CRFs) are used to process blocks of video in batches, using a two-dimensional CRF for detailed keyframe segmentation as well as 3D CRFs for propagating segmentations to the entire sequence of frames belonging to batches. Location information derived from the pose is also used to refine the results. Importantly, no hand-labeled training data is required by our method. We discuss the use of a continuity method that reuses learnt parameters between batches of frames and show how pose predictions can also be improved by our model. We provide an extensive evaluation of our approach, comparing it with a variety of alternative GrabCut-based methods and a prior state-of-the-art method. We also release our evaluation data to the community to facilitate further experiments. We find that our approach yields state-of-the-art qualitative and quantitative performance compared to prior work and to more heuristic alternative approaches.
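The keyframe-anchoring step described in the abstract (selecting only the most confident pose prediction within each localized temporal region to anchor further processing) can be sketched as follows. This is an illustrative sketch only: the function name, the non-overlapping fixed window, and the window size are assumptions, not details taken from the paper.

```python
def select_keyframes(confidences, window=10):
    """Pick the most confident pose detection per temporal window.

    confidences: one pose-detector confidence score per video frame.
    window: size of each localized temporal region (assumed fixed and
            non-overlapping here for simplicity).
    Returns the frame indices chosen as anchor keyframes.
    """
    keyframes = []
    for start in range(0, len(confidences), window):
        block = confidences[start:start + window]
        # index of the highest-confidence frame within this window
        best = max(range(len(block)), key=lambda i: block[i])
        keyframes.append(start + best)
    return keyframes

# Example: 7 frames, windows of 3 -> one anchor per window
scores = [0.1, 0.9, 0.3, 0.2, 0.8, 0.95, 0.4]
anchors = select_keyframes(scores, window=3)  # [1, 5, 6]
```

In the paper's pipeline, each such anchor keyframe would then seed a detailed 2D-CRF segmentation, which the 3D CRFs propagate to the remaining frames of the batch.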

Publisher
Database: Elsevier - ScienceDirect
Journal: Image and Vision Computing - Volume 51, July 2016, Pages 58–68
Authors