Pseudo low rank video representation

Article ID	Journal	Published Year	Pages	File Type
11002863	Pattern Recognition	2019	38 Pages	PDF

Abstract

Action recognition plays a fundamental role in computer vision and has drawn growing attention recently. This paper addresses action recognition under extreme Low Resolution (abbreviated as eLR). eLR video is often susceptible to noise, so extracting a robust representation is challenging. Moreover, because of the limited resolution, eLR video cannot be randomly cropped or resized, which complicates both the design and the training of a deep network for eLR video. This paper proposes a novel network for robust video representation that employs pseudo tensor low rank regularization. A new Video Low Rank Representation model (named VLRR) is first proposed to recover the inherent robust component of a given video; the recovered term is then introduced into a convolutional Network (denoted pLRN) as auxiliary pseudo Low Rank guidance. Benefiting from this auxiliary guidance, pLRN can learn an approximate low rank term end-to-end. In addition, this paper presents a new initialization strategy for the eLR recognition neTwork based on Tensor factorization (dubbed TenneT). TenneT is data-driven and learns the convolutional kernels entirely from the video distribution, without any back-propagation. It outperforms random initialization in both speed and accuracy. Experiments on benchmark datasets demonstrate the effectiveness and superiority of the proposed method.
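
To make the idea of recovering a robust low rank component concrete, the following is a minimal sketch only. It is not the VLRR model of the paper, whose formulation is not given in this abstract. It assumes a plain truncated SVD of the frame-unfolded video as a generic stand-in; the function name `low_rank_video`, the 12x12 clip size, the noise level, and the choice of rank are all hypothetical.

```python
import numpy as np

def low_rank_video(video, rank):
    """Rank-limited approximation of a (T, H, W) video via truncated SVD.

    Generic stand-in for low rank recovery, NOT the paper's VLRR model.
    """
    t, h, w = video.shape
    # Unfold the video: each frame becomes one row of a T x (H*W) matrix.
    mat = video.reshape(t, h * w)
    u, s, vt = np.linalg.svd(mat, full_matrices=False)
    # Keep only the leading `rank` singular components.
    approx = (u[:, :rank] * s[:rank]) @ vt[:rank]
    return approx.reshape(t, h, w)

rng = np.random.default_rng(0)
# Hypothetical eLR clip: one 12x12 pattern whose intensity grows over 8 frames,
# so the clean video is exactly rank 1 after unfolding.
pattern = rng.standard_normal((12, 12))
clean = np.stack([(1.0 + 0.1 * k) * pattern for k in range(8)])
# Additive Gaussian noise stands in for the noise typical of eLR video.
noisy = clean + 0.05 * rng.standard_normal(clean.shape)
recovered = low_rank_video(noisy, rank=1)

print("error before:", np.linalg.norm(noisy - clean))
print("error after: ", np.linalg.norm(recovered - clean))
```

In this toy setting the clean signal is concentrated in a single singular component while the noise is spread across all of them, so discarding the trailing components removes most of the noise. The abstract describes using such a recovered term as auxiliary guidance for the network; that training step is not sketched here.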

Keywords

Data driven; Action recognition; Low resolution