Article ID: 484178
Journal: Procedia Computer Science
Published Year: 2016
Pages: 8
File Type: PDF
Abstract

Script identification is a challenging task in bilingual or multilingual optical character recognition systems, and a considerable body of research on it has been reported in both Indian and non-Indian contexts. Many commercial and official regional documents in the states of India are bilingual, written in the regional language of the respective state with English words interspersed. Script identification is therefore one of the primary tasks in multi-script document recognition. This paper presents word-level script identification for Gujarati and English. For feature extraction, the directional energy distribution of a word is computed using Gabor filters with suitable frequencies and orientations. The proposed system uses an SVM classifier to assign the extracted features to one of the two scripts. The results obtained are quite encouraging.
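
To make the feature-extraction step concrete, the following is a minimal sketch of directional energy features from a small Gabor filter bank. It is not the authors' implementation: the abstract does not state the frequencies, orientations, Gaussian width, or kernel size, so the values used here (`frequencies`, `n_orientations`, `sigma`, `half_size`) and all function names are illustrative assumptions.

```python
import math


def gabor_kernel(frequency, theta, sigma=2.0, half_size=4):
    """Even (cosine) Gabor kernel as a list of rows. All parameters are assumed values."""
    kernel = []
    for y in range(-half_size, half_size + 1):
        row = []
        for x in range(-half_size, half_size + 1):
            # Coordinate along direction theta, where the sinusoid varies.
            along = x * math.cos(theta) + y * math.sin(theta)
            envelope = math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
            row.append(envelope * math.cos(2.0 * math.pi * frequency * along))
        kernel.append(row)
    # Remove the mean so that uniform regions produce no response.
    size = len(kernel)
    mean = sum(sum(row) for row in kernel) / (size * size)
    return [[value - mean for value in row] for row in kernel]


def directional_energy(image, kernel):
    """Sum of squared filter responses over all positions where the kernel fits."""
    k = len(kernel)
    rows, cols = len(image), len(image[0])
    energy = 0.0
    for top in range(rows - k + 1):
        for left in range(cols - k + 1):
            response = 0.0
            for i in range(k):
                for j in range(k):
                    response += kernel[i][j] * image[top + i][left + j]
            energy += response * response
    return energy


def gabor_features(image, frequencies=(0.125, 0.25), n_orientations=4):
    """Normalized energy per (frequency, orientation) pair, frequency-major order."""
    features = []
    for frequency in frequencies:
        for n in range(n_orientations):
            theta = n * math.pi / n_orientations
            features.append(directional_energy(image, gabor_kernel(frequency, theta)))
    total = sum(features) or 1.0  # guard against an all-zero image
    return [value / total for value in features]


# Toy "word image": vertical strokes with a period of 4 pixels.
vertical_strokes = [[1 if col % 4 < 2 else 0 for col in range(16)] for _ in range(16)]
print(gabor_features(vertical_strokes))
```

Each word image yields one feature vector (eight values with the assumed settings), and most of the energy falls in the filter whose frequency and orientation match the dominant stroke pattern. Vectors of this kind could then be passed to an SVM implementation, for example `sklearn.svm.SVC` from scikit-learn, to separate the two scripts.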
