Article ID: 525552
Journal: Computer Vision and Image Understanding
Published Year: 2016
Pages: 17
File Type: PDF
Abstract

• Focused and incidental scene text images are processed separately.
• Low-rank matrix recovery is exploited to process the incidental scene text images.
• A text confidence map is designed via a fuzzy inference system.
• The proposed algorithm handles both Latin and Farsi/Arabic scripts.
• Farsi/Arabic scene texts at arbitrary orientations are localized for the first time.

In this paper, a framework is proposed to localize both Farsi/Arabic and Latin scene text of different sizes, fonts and orientations. First, candidate text regions are extracted via an MSER detector enhanced by weighted median filtering to accommodate low-resolution text. At the same time, a fuzzy inference system (FIS) classifies the input image as either a focused scene text image, in which the text is the main content, or an incidental scene text image, in which the image does not focus on the text. For focused scene text images, the non-text candidates are filtered via an FIS. For incidental scene text images, an extra filtering algorithm based on low-rank matrix recovery is applied in addition to the FIS. Finally, a new approach based on clustering, minimum-area rectangles and the Radon transform is proposed to group the remaining text regions into single, arbitrarily oriented text lines. To evaluate the proposed algorithm, we created a collection of natural images containing both Farsi/Arabic and Latin text. Compared with state-of-the-art methods, the proposed method achieves the best performance on our dataset and the Epshtein dataset, and competitive performance on the ICDAR dataset.
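
The abstract names the Radon transform as one ingredient of the text-line construction step but does not describe the procedure. The following is only a minimal sketch of the general idea, not the authors' method: it assumes each surviving text region has already been reduced to a centroid, and it scores candidate angles with a discrete, Radon-style projection profile. The function name `estimate_line_angle` and its parameters `angle_step` and `bin_width` are hypothetical and chosen for illustration.

```python
import math
from collections import Counter


def estimate_line_angle(points, angle_step=1.0, bin_width=1.0):
    """Estimate the orientation (degrees, in [0, 180)) of a roughly collinear point set.

    Illustrative assumption: `points` are (x, y) centroids of text regions that
    were already clustered into one candidate line. For each candidate angle the
    points are projected onto the normal of a line with that direction, which is
    what a Radon projection of a point image does. The angle whose projection
    profile is most concentrated (largest sum of squared bin counts) wins.
    """
    best_angle, best_score = 0.0, -1
    n_steps = int(round(180.0 / angle_step))
    for i in range(n_steps):
        angle = i * angle_step
        theta = math.radians(angle)
        sin_t, cos_t = math.sin(theta), math.cos(theta)
        # Signed distance of each point from the line through the origin
        # with direction theta, quantized into bins of width `bin_width`.
        profile = Counter(
            round((-x * sin_t + y * cos_t) / bin_width) for x, y in points
        )
        # A sharp peak in the profile means the points line up at this angle.
        score = sum(count * count for count in profile.values())
        if score > best_score:
            best_angle, best_score = angle, score
    return best_angle


if __name__ == "__main__":
    # Synthetic centroids lying on a line tilted by 30 degrees.
    slope = math.tan(math.radians(30.0))
    centroids = [(10.0 * k, 10.0 * k * slope) for k in range(11)]
    print(estimate_line_angle(centroids))
```

The angle is expressed in whatever coordinate convention the input points use; with image coordinates, where y grows downward, the sign of the tilt is mirrored. A real pipeline would more likely apply a Radon transform to the pixel mask of the clustered regions and then fit a minimum-area rectangle, but that requires an image-processing library and details the abstract does not provide.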

Related Topics
Physical Sciences and Engineering > Computer Science > Computer Vision and Pattern Recognition