Article ID: 536159
Journal: Pattern Recognition Letters
Published Year: 2016
Pages: 8
File Type: PDF
Abstract

• A novel method for baseline detection of multi-lingual multi-oriented text lines.
• To our knowledge, this is the first baseline detection method for multi-turn text lines.
• The method uses machine learning along with rotation invariant features for constructing the baseline.
• The method improves the performance of the state-of-the-art character segmentation method substantially.

Many handwritten text recognition systems use baseline information to improve recognition of the characters in a text line, and improper baseline detection degrades recognition performance. In this paper we propose a novel baseline detection scheme for unconstrained handwritten text lines in multilingual documents. To detect the baseline of a text line, we first detect the set of significant contour points (S-points) of the text line. Every non-singleton subset of S-points forms a curve. An SVM, trained on orientation-invariant features of such curves, determines whether a given curve can form a probable baseline of the input text line. The curves classified as probable baselines are sorted in ascending order of their relative positions to obtain the optimal baseline. We tested our method on handwritten text lines in Bangla (Bengali), English (Roman), Kannada, Oriya, Devnagari and Persian scripts and obtained encouraging results.
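
The candidate-generation step described in the abstract can be illustrated with a minimal Python sketch. The abstract does not say which orientation-invariant features are used, so the segment lengths and turning angles below are an assumption chosen only because they are unchanged by rotation and translation; the function names (`non_singleton_subsets`, `invariant_features`, `rotate`) are hypothetical, and the SVM classification and final sorting steps are not shown.

```python
import math
from itertools import combinations


def non_singleton_subsets(points, max_size=None):
    """Yield every subset of `points` with at least two elements.

    Subsets are tuples that keep the input order of the points.
    `max_size` is an illustrative cap, not something the abstract mentions.
    """
    n = len(points)
    top = n if max_size is None else min(max_size, n)
    for k in range(2, top + 1):
        yield from combinations(points, k)


def invariant_features(curve):
    """Return (segment_lengths, turning_angles) for a polyline.

    Assumed features: both lists are unchanged when the whole curve
    is rotated or translated.
    """
    lengths = [math.dist(p, q) for p, q in zip(curve, curve[1:])]
    angles = []
    for a, b, c in zip(curve, curve[1:], curve[2:]):
        heading_in = math.atan2(b[1] - a[1], b[0] - a[0])
        heading_out = math.atan2(c[1] - b[1], c[0] - b[0])
        diff = heading_out - heading_in
        # Wrap the heading change into the range [-pi, pi].
        angles.append(math.atan2(math.sin(diff), math.cos(diff)))
    return lengths, angles


def rotate(points, theta):
    """Rotate 2-D points about the origin by `theta` radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [(x * c - y * s, x * s + y * c) for x, y in points]
```

For n S-points there are 2^n - n - 1 non-singleton subsets, so exhaustive enumeration is only practical for small n; the `max_size` cap is one illustrative way to bound it. Computing `invariant_features` on a curve and on `rotate(curve, theta)` should give the same values up to floating-point error, which is the property a classifier over multi-oriented text lines would rely on.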

Related Topics
Physical Sciences and Engineering > Computer Science > Computer Vision and Pattern Recognition