Joint sparse representation for video-based face recognition

Article ID	Journal	Published Year	Pages	File Type
409996	Neurocomputing	2014	7 Pages	PDF

Abstract

•Propose a reconstruction-based method for video-based face recognition in which all images from a probe video clip are jointly recovered.
•Use two sparsity constraints to make the representation more credible.
•Develop an efficient optimization algorithm.
•Achieve more competitive performance on the challenging YTC dataset.

Video-based Face Recognition (VFR) can be cast as the problem of measuring the similarity of two image sets, where the examples from one video clip form one image set. In this paper, we treat the face images from each clip as an ensemble and formulate VFR as a Joint Sparse Representation (JSR) problem. In JSR, to adaptively learn the sparse representation of a probe clip, we simultaneously consider class-level and atom-level sparsity: the former imposes structure on the enrolled clips through a structured sparse regularizer (i.e., the L2,1-norm), and the latter selects a few related examples through a sparse regularizer (i.e., the L1-norm). In addition, we pre-train a compact dictionary to accelerate the algorithm, and impose a non-negativity constraint on the recovered coefficients to encourage positive correlations in the representation. A probe clip is assigned to the class with the lowest accumulated reconstruction error. We conduct extensive experiments on three real-world databases: Honda, MoBo and YouTube Celebrities (YTC). The results demonstrate that our method is more competitive than state-of-the-art VFR methods.
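
The pipeline described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the objective 0.5*||Y - DX||_F^2 + lam1*||X||_1 + lam2*sum_c ||X_c||_F subject to X >= 0, where the columns of Y are the frames of the probe clip, the columns of D are dictionary atoms grouped by class, and X_c is the block of coefficients for class c. The solver is a generic accelerated proximal gradient (FISTA-style) loop with a fixed step size. All function names, the values of lam1 and lam2, and the synthetic data generator are hypothetical and chosen only for illustration; the dictionary pre-training step is omitted.

```python
import numpy as np

def prox_nonneg_sparse_group(V, groups, t1, t2):
    """Atom-level non-negative soft-thresholding, then class-level block shrinkage."""
    U = np.maximum(V - t1, 0.0)            # L1 shrinkage combined with X >= 0
    for rows in groups:                    # one block of rows per class
        norm = np.linalg.norm(U[rows])     # Frobenius norm of the class block
        scale = max(0.0, 1.0 - t2 / norm) if norm > 0 else 0.0
        U[rows] *= scale
    return U

def joint_sparse_code(D, Y, groups, lam1=0.01, lam2=0.01, n_iter=300):
    """Jointly code all probe frames Y over D (assumed objective, see lead-in)."""
    L = np.linalg.norm(D, 2) ** 2          # Lipschitz constant of the gradient
    X = np.zeros((D.shape[1], Y.shape[1]))
    Z, t = X.copy(), 1.0
    for _ in range(n_iter):
        grad = D.T @ (D @ Z - Y)
        X_new = prox_nonneg_sparse_group(Z - grad / L, groups, lam1 / L, lam2 / L)
        t_new = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
        Z = X_new + ((t - 1.0) / t_new) * (X_new - X)   # momentum step
        X, t = X_new, t_new
    return X

def classify(D, Y, X, groups):
    """Pick the class with the lowest reconstruction error accumulated over frames."""
    errors = [float(np.linalg.norm(Y - D[:, rows] @ X[rows]) ** 2) for rows in groups]
    return int(np.argmin(errors)), errors

def make_toy_data(n_classes=3, atoms_per_class=8, dim=50, n_probe=10,
                  true_class=1, seed=0):
    """Hypothetical synthetic data: each class is a noisy cluster around one center."""
    rng = np.random.default_rng(seed)
    centers = rng.normal(size=(n_classes, dim))
    D = np.hstack([centers[c][:, None] + 0.1 * rng.normal(size=(dim, atoms_per_class))
                   for c in range(n_classes)])
    D /= np.linalg.norm(D, axis=0)         # unit-norm atoms
    groups = [np.arange(c * atoms_per_class, (c + 1) * atoms_per_class)
              for c in range(n_classes)]
    Y = centers[true_class][:, None] + 0.1 * rng.normal(size=(dim, n_probe))
    Y /= np.linalg.norm(Y, axis=0)         # unit-norm probe frames
    return D, Y, groups

if __name__ == "__main__":
    D, Y, groups = make_toy_data(true_class=1)
    X = joint_sparse_code(D, Y, groups)
    label, errors = classify(D, Y, X, groups)
    print("predicted class:", label)
    print("per-class reconstruction errors:", errors)
```

In this sketch the class-level penalty acts on the whole coefficient block of a class across all probe frames, so the frames of a clip tend to agree on which enrolled classes they use, while the atom-level threshold keeps only a few atoms within those classes.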

Keywords

Face recognition; Accelerated proximal gradient; Sparse representation; Structured sparse representation