Article code: 529833
Journal code: 869716
Publication year: 2013
English article: 10-page PDF
Full-text version: free download
English title of the ISI article
Fast human action classification and VOI localization with enhanced sparse coding
Related topics
Engineering and Basic Sciences > Computer Engineering > Computer Vision and Pattern Recognition
English abstract

Sparse coding, which encodes natural visual signals into a sparse space for visual codebook generation and feature quantization, has been successfully used in many image classification applications. However, it has seldom been explored for video analysis tasks. In particular, characterizing the visual patterns of diverse human actions, which vary both spatially and temporally, poses additional challenges for the conventional sparse coding scheme. In this paper, we propose an enhanced sparse coding scheme that learns a discriminative dictionary and optimizes the local pooling strategy. Localizing when and where a specific action happens in realistic videos is another challenging task. Using the sparse-coding-based representations of human actions, this paper further presents a novel coarse-to-fine framework for localizing the Volumes of Interest (VOIs) of the actions. First, local visual features are transformed into the sparse signal domain through our enhanced sparse coding scheme. Second, to avoid an exhaustive scan of entire videos during VOI localization, we extend Spatial Pyramid Matching into the temporal domain, yielding Spatial Temporal Pyramid Matching, to obtain the VOI candidates. Finally, a multi-level branch-and-bound approach is developed to refine the VOI candidates. The proposed framework also avoids prohibitive computations in local similarity matching (e.g., nearest-neighbor voting). Experimental results on two popular benchmark datasets (KTH and YouTube UCF) and the widely used localization dataset (MSR) demonstrate that our approach reduces computational cost significantly while maintaining classification accuracy comparable to that of state-of-the-art methods.
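As a rough illustration of the sparse coding and pooling steps the abstract mentions, the sketch below encodes a feature vector against a fixed dictionary with iterative soft-thresholding (ISTA) and then max-pools several codes. This is a generic textbook formulation, not the paper's enhanced scheme: the dictionary, the `lam` and `step` values, and the function names `sparse_code` and `max_pool` are all assumptions made for the example.

```python
# Illustrative sketch only: generic L1 sparse coding (ISTA) plus max pooling.
# The dictionary, lam, step, and function names are assumptions, not taken from the paper.

def soft_threshold(value, threshold):
    """Shrink a scalar toward zero by `threshold` (proximal step for the L1 penalty)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def sparse_code(x, dictionary, lam=0.1, step=0.1, iterations=200):
    """Approximately minimize 0.5 * ||x - D a||^2 + lam * ||a||_1 over the code a.

    `dictionary` is a list of atoms, each a list with the same length as x.
    `step` should not exceed 1 / (largest eigenvalue of D^T D) for convergence.
    """
    code = [0.0] * len(dictionary)
    for _ in range(iterations):
        # Reconstruction D a and residual (D a - x) for the current code.
        recon = [sum(code[k] * dictionary[k][i] for k in range(len(dictionary)))
                 for i in range(len(x))]
        residual = [r - xi for r, xi in zip(recon, x)]
        # Gradient step on the quadratic term, then soft-thresholding.
        code = [
            soft_threshold(
                code[k] - step * sum(a * r for a, r in zip(dictionary[k], residual)),
                step * lam,
            )
            for k in range(len(dictionary))
        ]
    return code


def max_pool(codes):
    """Pool several sparse codes into one vector: per-atom maximum absolute value."""
    return [max(abs(c[k]) for c in codes) for k in range(len(codes[0]))]
```

With an orthonormal dictionary the result reduces to soft-thresholding the input, which makes the sparsity effect easy to see: small coefficients are driven exactly to zero.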


► We propose an enhanced sparse coding scheme by a discriminative learning method.
► The sparse coding scheme is further enhanced by an optimized local pooling strategy.
► A multi-level branch-and-bound approach is developed to improve the efficiency.
► We evaluate our solutions on popular benchmark human action datasets.
► Our approach improves efficiency while maintaining effectiveness.
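The idea of extending Spatial Pyramid Matching into the temporal domain, described in the abstract, can be sketched as pooling sparse codes over a grid that subdivides the video volume along x, y, and t. The following is a minimal sketch under assumed conventions (level l splits each axis into 2**l cells, max pooling of absolute values, a hypothetical `pyramid_pool` helper); the paper's actual pyramid layout and matching kernel may differ.

```python
# Illustrative sketch only: max pooling over a spatio-temporal pyramid.
# The cell layout, pooling rule, and function name are assumptions, not taken from the paper.

def pyramid_pool(features, extent, levels=2):
    """Concatenate max-pooled codes from every cell of every pyramid level.

    features: list of ((x, y, t), code) pairs with coordinates in [0, extent) per axis.
    extent:   (width, height, duration) of the video volume.
    levels:   number of pyramid levels; level l splits each axis into 2**l cells.
    """
    dim = len(features[0][1])
    pooled = []
    for level in range(levels):
        cells = 2 ** level
        grid = {}
        for position, code in features:
            # Index of the cell that contains this feature along x, y, and t.
            index = tuple(min(int(p * cells / e), cells - 1)
                          for p, e in zip(position, extent))
            best = grid.setdefault(index, [0.0] * dim)
            for k, value in enumerate(code):
                best[k] = max(best[k], abs(value))
        # Visit cells in a fixed order; empty cells contribute zeros.
        for ix in range(cells):
            for iy in range(cells):
                for it in range(cells):
                    pooled.extend(grid.get((ix, iy, it), [0.0] * dim))
    return pooled
```

Cells whose pooled response is high for a given action could then serve as coarse candidates for a finer search, which is the role the abstract assigns to the subsequent branch-and-bound refinement.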

Publisher
Database: Elsevier - ScienceDirect
Journal: Journal of Visual Communication and Image Representation - Volume 24, Issue 2, February 2013, Pages 127–136