Numeral Script Identification from Handwritten Document Images

Article ID	Journal	Published Year	Pages	File Type
487495	Procedia Computer Science	2015	10 Pages	PDF

Abstract

This paper proposes a novel HNSI (Handwritten Numeral Script Identification) framework that identifies the script of document images containing numeral text written in any one of four popular Indic scripts, namely Bangla, Devanagari, Roman and Urdu. A dataset of 4000 word-level numeral images, equally distributed across the four scripts, was collected from individuals of varying age, sex and educational qualification. Several spatial and frequency domain features are computed and combined into a 55-dimensional feature vector. For experimentation, the whole dataset is divided in a 2:1 ratio for training and testing. The performance of different classifiers is compared, and MLP (multilayer perceptron) is found to be the best when accuracy is evaluated for Four-script, Tri-script and Bi-script combinations.
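The experimental setup described in the abstract can be illustrated with a short sketch. It enumerates the Bi-script, Tri-script and Four-script combinations of the four scripts and divides 4000 sample indices in a 2:1 ratio. The helper names, the per-script (stratified) splitting, the random seed and the rounding of the split point are assumptions made for illustration; the abstract states only the ratio and the equal distribution of scripts, not how the split was performed.

```python
from itertools import combinations
import random

SCRIPTS = ["Bangla", "Devanagari", "Roman", "Urdu"]
SAMPLES_PER_SCRIPT = 1000  # 4000 images, equally distributed over 4 scripts


def script_combinations(scripts):
    """Return every Bi-, Tri- and Four-script subset, keyed by subset size."""
    return {k: list(combinations(scripts, k)) for k in range(2, len(scripts) + 1)}


def stratified_split(labels, train_parts=2, test_parts=1, seed=0):
    """Split sample indices per script in a train_parts:test_parts ratio.

    Splitting per script is an assumption; it keeps the scripts balanced
    in both the training and the test subsets.
    """
    rng = random.Random(seed)
    train, test = [], []
    for script in sorted(set(labels)):
        idx = [i for i, lab in enumerate(labels) if lab == script]
        rng.shuffle(idx)
        # Integer division: the training share is rounded down.
        cut = len(idx) * train_parts // (train_parts + test_parts)
        train.extend(idx[:cut])
        test.extend(idx[cut:])
    return train, test


# Placeholder labels standing in for the real dataset annotations.
labels = [s for s in SCRIPTS for _ in range(SAMPLES_PER_SCRIPT)]
combos = script_combinations(SCRIPTS)
train_idx, test_idx = stratified_split(labels)

for size, subsets in combos.items():
    print(size, "scripts:", len(subsets), "combination(s)")
print("train:", len(train_idx), "test:", len(test_idx))
```

With four scripts there are six Bi-script, four Tri-script and one Four-script combination, so a full evaluation along these lines would train and test a classifier eleven times, once per combination, restricting the 55-dimensional feature vectors to the samples of the scripts involved.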