Article ID: 526725
Journal: Image and Vision Computing
Published Year: 2013
Pages: 11
File Type: PDF
Abstract

• We propose a contextual word model for keyword spotting in handwritten Chinese documents.
• The contextual word model combines a character classifier with geometric and linguistic contexts.
• Promising results were obtained on the large handwriting database CASIA-HWDB.
• The geometric and linguistic contexts improve the spotting performance significantly.

This paper proposes a method for keyword spotting in off-line handwritten Chinese documents using a contextual word model. The model measures the similarity between the query word and every candidate word in the document by combining a character classifier with geometric and linguistic contexts. The geometric context model characterizes single-character likeliness and between-character relationships. The linguistic model exploits the dependency between the word and its external adjacent characters. The combining weights are optimized on training documents. Experiments on the large handwriting database CASIA-HWDB demonstrate the effectiveness of the proposed method and confirm the benefits of the geometric and linguistic contexts. Compared with transcription-based text search, the proposed method provides a higher recall rate, and for four-character words it provides both higher precision and higher recall.
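
The score combination described in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual formula, so the weighted sum of log-scores below is an assumption, and every name in it (`CandidateScores`, `word_similarity`, `spot`, the weights and the threshold) is hypothetical rather than taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class CandidateScores:
    """Hypothetical per-candidate log-scores; field names are illustrative."""
    classifier: float  # character classifier score for the query characters
    geometric: float   # geometric context score (single-character and between-character)
    linguistic: float  # linguistic context score (dependency on adjacent characters)


def word_similarity(scores, weights):
    """Weighted sum of the three component scores (an assumed combination rule)."""
    w_cls, w_geo, w_lng = weights
    return (w_cls * scores.classifier
            + w_geo * scores.geometric
            + w_lng * scores.linguistic)


def spot(candidates, weights, threshold):
    """Return indices of candidates whose combined similarity reaches the threshold."""
    return [i for i, c in enumerate(candidates)
            if word_similarity(c, weights) >= threshold]


# Two made-up candidate words: the first matches the query well, the second poorly.
candidates = [
    CandidateScores(classifier=-0.5, geometric=-0.2, linguistic=-0.1),
    CandidateScores(classifier=-3.0, geometric=-2.0, linguistic=-1.5),
]
weights = (1.0, 0.5, 0.5)  # in the paper these are optimized on training documents
hits = spot(candidates, weights, threshold=-1.0)
```

Raising the threshold in such a scheme would typically trade recall for precision, which is the trade-off the comparison with transcription-based search refers to.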

Related Topics
Physical Sciences and Engineering > Computer Science > Computer Vision and Pattern Recognition