Article ID: 531822
Journal: Pattern Recognition
Published Year: 2016
Pages: 16
File Type: PDF
Abstract

• Novel comprehensive Persian handwritten database with thousands of items is offered.
• A framework for creating databases for offline handwriting recognition is proposed.
• An opportunity for tens of pattern recognition fields of research has been created.
• Detailed ground truth has been provided for all the presented itemsets and samples.

Developing a standard database is an essential step for research on offline handwriting recognition. This paper offers a novel, comprehensive database for offline Persian handwriting recognition. Seven pages of forms were designed and completed by 500 native Persian writers, balanced equally by gender and randomly selected from all over Iran. The completed forms were then scanned at a resolution of 300 DPI. Through several intensive processing steps, a large number of isolated digits, numeral strings, touching digits, dates, words, names, alphabetical letters, free texts, and arithmetic and special symbols were extracted from these forms and organized as a standard database. All samples in this database were assigned detailed ground truth and stored in three color formats: true color, gray level, and binary. In addition, all subsets of the database were randomly partitioned into training, validation, and testing sets. We hope this comprehensive database will extend research in the pattern recognition community.
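
The abstract does not give the conversion or partitioning details, so the following is only a minimal sketch of the two steps it names: deriving gray-level and binary values from a true-color pixel, and randomly partitioning samples into training, validation, and testing sets. The function names, the BT.601 luma weights, the fixed threshold of 128, the 60/20/20 split, and the seed are all assumptions chosen for illustration and are not taken from the paper.

```python
import random


def to_gray(pixel):
    """Convert an (R, G, B) pixel to an 8-bit gray level.

    Uses the common ITU-R BT.601 luma weights (an assumption here).
    """
    r, g, b = pixel
    return round(0.299 * r + 0.587 * g + 0.114 * b)


def to_binary(gray, threshold=128):
    """Map a gray level to 0 (dark, e.g. ink) or 1 (light, e.g. paper).

    The fixed threshold is a placeholder, not the paper's method.
    """
    return 1 if gray >= threshold else 0


def partition(sample_ids, train_pct=60, val_pct=20, seed=0):
    """Randomly split sample IDs into training, validation, and testing sets.

    The percentages are hypothetical; whatever remains after the training
    and validation shares goes to the testing set.
    """
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)  # seeded so the split is reproducible
    n_train = len(ids) * train_pct // 100
    n_val = len(ids) * val_pct // 100
    return {
        "train": ids[:n_train],
        "validation": ids[n_train:n_train + n_val],
        "test": ids[n_train + n_val:],
    }


if __name__ == "__main__":
    splits = partition(range(100))
    print({name: len(ids) for name, ids in splits.items()})
    print(to_binary(to_gray((30, 30, 30))), to_binary(to_gray((240, 240, 240))))
```

Seeding the shuffle keeps the split reproducible, which matters for a benchmark database because published results are only comparable when every group evaluates on the same testing set.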
