Article ID: 530210
Journal: Pattern Recognition
Published Year: 2015
Pages: 10 Pages
File Type: PDF
Abstract

• New OCR concept designed for the requirements of historic prints.
• Pattern matching with patterns generated on the fly.
• Integration of a RIP (raster image processor) into OCR software.
• Outperforms established OCR software, especially for out-of-the-ordinary fonts.
• Consistently good hit rates for arbitrary fonts.

In this paper we present a new OCR concept designed for the requirements of historic prints in the context of mass digitization. Its core is glyph recognition based on pattern matching, with patterns that are derived from computer font glyphs and generated on the fly. The classification of a sample is organized as a search for the most similar glyph pattern. This results in consistently good hit rates for arbitrary fonts without any training. In particular, we investigate the performance of our prototype in comparison to popular commercially available OCR software.
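The idea of classifying a sample by searching for the most similar glyph pattern can be illustrated with a minimal sketch. This is not the authors' implementation: the toy fonts, the `generate_patterns`, `similarity`, and `classify` names, and the pixel-agreement score are all assumptions made for illustration. A real system would rasterize glyphs from actual font files on demand, where this sketch uses a small hard-coded table as a stand-in.

```python
from typing import Dict, Iterator, Tuple

Bitmap = Tuple[str, ...]  # rows of '#' (ink) and '.' (background)

# Hypothetical stand-in for a rasterizer: two toy "fonts" with 3x5 glyphs.
# A real system would render glyph outlines from font files instead.
TOY_FONTS: Dict[str, Dict[str, Bitmap]] = {
    "toy-serif": {
        "I": ("###", ".#.", ".#.", ".#.", "###"),
        "L": ("#..", "#..", "#..", "#..", "###"),
        "T": ("###", ".#.", ".#.", ".#.", ".#."),
    },
    "toy-sans": {
        "I": (".#.", ".#.", ".#.", ".#.", ".#."),
        "L": ("#..", "#..", "#..", "#..", "##."),
        "T": ("###", ".#.", ".#.", ".#.", ".#."),
    },
}


def generate_patterns(
    fonts: Dict[str, Dict[str, Bitmap]]
) -> Iterator[Tuple[str, str, Bitmap]]:
    """Lazily yield (font name, character, bitmap) candidates."""
    for font_name, glyphs in fonts.items():
        for char, bitmap in glyphs.items():
            yield font_name, char, bitmap


def similarity(a: Bitmap, b: Bitmap) -> float:
    """Fraction of pixels on which two equally sized bitmaps agree."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("bitmaps must have the same dimensions")
    total = sum(len(row) for row in a)
    same = sum(pa == pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return same / total


def classify(
    sample: Bitmap, fonts: Dict[str, Dict[str, Bitmap]] = TOY_FONTS
) -> Tuple[str, str, float]:
    """Search all generated patterns and return the most similar one."""
    font_name, char, bitmap = max(
        generate_patterns(fonts), key=lambda p: similarity(sample, p[2])
    )
    return char, font_name, similarity(sample, bitmap)


# An "L" with one spurious ink pixel, as a degraded print might produce.
noisy_l: Bitmap = ("#..", "#..", "#..", "#.#", "###")
char, font_name, score = classify(noisy_l)
print(char, font_name, round(score, 3))  # best match: 'L' from 'toy-serif'
```

No training step appears anywhere in the sketch: supporting another typeface only means adding another source of patterns to search over, which is the property the abstract emphasizes. The exhaustive `max` over all candidates is the simplest possible search and is only practical here because the pattern set is tiny.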

Related Topics
Physical Sciences and Engineering > Computer Science > Computer Vision and Pattern Recognition