Article code: 530132
Journal code: 869745
Publication year: 2012
English article: 12-page PDF
Full-text version: Free download
English title of the ISI article
A synthesised word approach to word retrieval in handwritten documents
Related topics
Engineering and Basic Sciences; Computer Engineering; Computer Vision and Pattern Recognition
English abstract

Recent technological advances have enhanced the computer-based indexing and searching of digitised printed books. The performance now achievable in this domain, however, does not at present extend to handwritten texts which inherently contain more significant letter-based variation within their content. Furthermore, in most studies that address the handwritten text retrieval problem, a large training dataset is required which, very often, influences the context and search lexicon. In this paper a novel method is described to overcome the training data problem using a character-based modelling (termed grapheme spectrum) approach and a word modelling technique (termed synthesised word) enabling the retrieval of keywords that have not explicitly been seen in the training set. When tested on an illustrative historical manuscript the performance of the proposed word retrieval technique shows a clear advantage over existing methods.


► An unconstrained keyword retrieval system for handwritten documents is constructed.
► A novel feature-based approach for modelling out-of-vocabulary words is described.
► The system accepts a training dataset with uneven occurrence of characters.
► The approach is successful even when using small training datasets.
► The method offers significant practical benefits for the exploration of manuscripts.

Publisher
Database: Elsevier - ScienceDirect
Journal: Pattern Recognition - Volume 45, Issue 12, December 2012, Pages 4225–4236