Article ID: 11002863
Journal: Pattern Recognition
Published Year: 2019
Pages: 38 Pages
File Type: PDF
Abstract
Action recognition plays a fundamental role in computer vision and has drawn growing attention recently. This paper addresses action recognition under extreme Low Resolution (abbreviated as eLR). eLR video is often susceptible to noise, so extracting a robust representation is challenging. Moreover, because of the limited resolution, eLR video cannot be randomly cropped or resized, which complicates both the design and the training of a deep network for eLR video. This paper proposes a novel network for robust video representation that employs pseudo tensor low rank regularization. A new Video Low Rank Representation model (named VLRR) is first proposed to recover the inherent robust component of a given video, and the recovered term is then introduced into a convolutional Network (denoted pLRN) as an auxiliary pseudo Low Rank guidance. Benefiting from this auxiliary guidance, pLRN can learn an approximate low rank term end-to-end. In addition, this paper presents a new initialization strategy for eLR recognition neTwork based on Tensor factorization (dubbed TenneT). TenneT is data-driven and learns the convolutional kernels entirely from the video distribution, without any back-propagation. It outperforms random initialization in both speed and accuracy. Experiments on benchmark datasets demonstrate the effectiveness and superiority of the proposed method.
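To make the idea of recovering a robust low rank component from a noisy video concrete, here is a minimal sketch. It is not the VLRR model from the paper, whose formulation the abstract does not give. It substitutes a plain truncated SVD on the frame-unfolded video matrix. The function name `low_rank_video`, the synthetic clip, the chosen rank, and the noise level are all assumptions made for illustration.

```python
import numpy as np


def low_rank_video(video, rank):
    """Rank-`rank` approximation of a (T, H, W) video via truncated SVD.

    Illustrative stand-in only: the paper's VLRR model is not specified
    in the abstract, so this uses ordinary matrix low rank approximation.
    """
    t, h, w = video.shape
    # Unfold the video so that each row is one flattened frame.
    mat = video.reshape(t, h * w)
    u, s, vt = np.linalg.svd(mat, full_matrices=False)
    # Keep only the leading `rank` singular components.
    approx = (u[:, :rank] * s[:rank]) @ vt[:rank]
    return approx.reshape(t, h, w)


rng = np.random.default_rng(0)

# Hypothetical 8-frame, 12x16 clip: one spatial pattern whose brightness
# changes over time (exactly rank 1 when unfolded), plus Gaussian noise.
pattern = rng.random((12, 16))
gains = np.linspace(0.5, 1.5, 8)
clean = gains[:, None, None] * pattern
noisy = clean + 0.05 * rng.standard_normal(clean.shape)

recovered = low_rank_video(noisy, rank=1)

# Distance to the clean clip before and after the low rank projection.
err_noisy = np.linalg.norm(noisy - clean)
err_recovered = np.linalg.norm(recovered - clean)
print(f"error before: {err_noisy:.3f}, after: {err_recovered:.3f}")
```

In this toy setting the low rank projection discards most of the noise because the clean signal lives in a single singular component. Real eLR video is not exactly low rank, which is presumably why the paper describes the recovered term as an auxiliary guidance signal for the network rather than as the final representation.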
Related Topics
Physical Sciences and Engineering > Computer Science > Computer Vision and Pattern Recognition