Article ID: 532028
Journal: Pattern Recognition
Published Year: 2015
Pages: 14
File Type: PDF
Abstract

• A novel model, tensor representation learning based image patch analysis (TRL-IPA), is proposed for document understanding.
• TRL-IPA is built on a general formulation of the convergent tensor representation learning (CTRL) algorithms.
• The CTRL algorithms are theoretically guaranteed to converge to a locally optimal solution of the learning problem.
• Extensive experiments demonstrate the superiority of TRL-IPA over related vector and tensor representation based approaches.

In this paper, we introduce a novel framework for text identification and recognition, called tensor representation learning based image patch analysis (TRL-IPA). Unlike most previous text identification approaches, which can only be applied to binarized images, TRL-IPA can be applied directly to gray-level and color images. TRL-IPA is built on a general formulation of the convergent tensor representation learning (CTRL) algorithms. In the implementation of TRL-IPA, image patches are represented as tensors, and low-dimensional representations of these tensors are learned via a CTRL algorithm. To identify text regions in new document images, a random forest classifier is trained in the learned tensor subspace. Moreover, the TRL-IPA framework can be applied straightforwardly to recognition problems, such as handwritten digit recognition. We conducted extensive experiments on ancient Chinese, Arabic and Cyrillic document images to evaluate TRL-IPA on text identification tasks. The experimental results demonstrate its effectiveness and robustness. In addition, recognition results on images of handwritten digits show its advantage over state-of-the-art vector and tensor representation based approaches.
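The patch-to-subspace step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the CTRL update rules, so the code below substitutes a plain HOSVD-style projection (one eigen-decomposition per tensor mode) as a stand-in; it is not the CTRL algorithm. The function names, the patch size, the stride and the rank are all hypothetical choices made for illustration, and the input is a random array rather than a real document image.

```python
import numpy as np

def extract_patches(image, size, stride):
    """Cut a 2-D gray-level image into square patches (second-order tensors)."""
    h, w = image.shape
    patches = [
        image[r:r + size, c:c + size]
        for r in range(0, h - size + 1, stride)
        for c in range(0, w - size + 1, stride)
    ]
    return np.stack(patches)  # shape: (n_patches, size, size)

def fit_mode_projections(patches, rank):
    """HOSVD-style stand-in for CTRL (assumption): one orthonormal basis per mode."""
    centered = patches - patches.mean(axis=0)
    # Mode-1 (row) scatter sums X X^T; mode-2 (column) scatter sums X^T X.
    row_scatter = np.einsum('nij,nkj->ik', centered, centered)
    col_scatter = np.einsum('nij,nik->jk', centered, centered)
    # eigh sorts eigenvalues in ascending order, so keep the last `rank` columns.
    u_rows = np.linalg.eigh(row_scatter)[1][:, -rank:]
    u_cols = np.linalg.eigh(col_scatter)[1][:, -rank:]
    return u_rows, u_cols

def project(patches, u_rows, u_cols):
    """Multilinear projection Y = U_rows^T X U_cols for every patch X."""
    return np.einsum('ia,nij,jb->nab', u_rows, patches, u_cols)

# Toy usage on random data standing in for a gray-level document image.
rng = np.random.default_rng(0)
image = rng.random((32, 32))
patches = extract_patches(image, size=8, stride=4)
u_rows, u_cols = fit_mode_projections(patches, rank=3)
features = project(patches, u_rows, u_cols).reshape(len(patches), -1)
```

Each row of `features` is the flattened low-dimensional representation of one patch. Given a text or non-text label per patch, a random forest (for example scikit-learn's `RandomForestClassifier`) could then be trained on these rows, which mirrors the classification step the abstract describes.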
