Article ID | Journal | Published Year | Pages |
---|---|---|---|
526725 | Image and Vision Computing | 2013 | 11 Pages |
• We propose a contextual word model for keyword spotting from handwritten Chinese documents.
• The contextual word model combines character classifier, geometric and linguistic contexts.
• Promising results were obtained on a large handwriting database CASIA-HWDB.
• The geometric and linguistic contexts improve the spotting performance significantly.
This paper proposes a method for keyword spotting in off-line handwritten Chinese documents using a contextual word model. The model measures the similarity between the query word and every candidate word in the document by combining a character classifier with geometric and linguistic contexts. The geometric context model characterizes single-character likeliness and between-character relationships. The linguistic context model exploits the dependency between the word and its externally adjacent characters. The combining weights are optimized on training documents. Experiments on the large handwriting database CASIA-HWDB demonstrate the effectiveness of the proposed method and confirm the benefits of the geometric and linguistic contexts. Compared with transcription-based text search, the proposed method can provide a higher recall rate, and for words of four characters it provides both higher precision and higher recall.
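The combination of scores described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact form of the contextual word model, so this assumes a simple weighted sum of log-scores followed by a threshold; the names `CandidateScores`, `word_similarity`, and `spot`, as well as the example weights and threshold, are hypothetical and not taken from the paper.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class CandidateScores:
    """Hypothetical log-scores for one candidate word in a document."""
    classifier: float  # character classifier score for the query characters
    geometric: float   # geometric context score (single-character and between-character)
    linguistic: float  # linguistic context score (adjacent external characters)


def word_similarity(s: CandidateScores, weights: Sequence[float]) -> float:
    """Combine the three scores as a weighted sum (an assumed form)."""
    w_cls, w_geo, w_lin = weights
    return w_cls * s.classifier + w_geo * s.geometric + w_lin * s.linguistic


def spot(candidates: List[CandidateScores],
         weights: Sequence[float],
         threshold: float) -> List[int]:
    """Return indices of candidates whose combined score reaches the threshold."""
    return [i for i, s in enumerate(candidates)
            if word_similarity(s, weights) >= threshold]


# Illustrative values only; in the paper the weights are optimized on training documents.
candidates = [
    CandidateScores(classifier=-1.0, geometric=-0.5, linguistic=-0.2),
    CandidateScores(classifier=-4.0, geometric=-3.0, linguistic=-2.0),
]
weights = (1.0, 0.5, 0.5)
hits = spot(candidates, weights, threshold=-2.0)
```

Raising the threshold in such a scheme would typically trade recall for precision, which is the axis along which the abstract compares the method with transcription-based text search.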