Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
536378 | Pattern Recognition Letters | 2014 | 9 Pages |
•We build a compact Japanese text recognizer using an integrated evaluation model.
•The model includes a combined recognizer and geometric and linguistic contexts.
•The text recognizer with 4438 classes requires only 32 MB of memory and much less time.
•We find that a smaller off-line recognizer achieves the best accuracy in the above model.
•Furthermore, we prove that the linguistic context is more efficient than the geometric context.
This paper presents a complexity reduction of an on-line handwritten Japanese text recognition system, achieved by selecting an optimal off-line recognizer to combine with an on-line recognizer, geometric context evaluation, and linguistic context evaluation. The result is that a surprisingly simple off-line recognizer, which is weak on its own, yields nearly the best recognition rate when combined with the other evaluation factors, at remarkably small space and time complexity. In general, lower dimensionality and fewer principal components produce a smaller set of prototypes, which reduces memory cost and time cost. This also degrades the recognition rate, however, so a compromise is needed. With an evaluation function that combines the above factors, an off-line recognizer configured with only 50 dimensions and as few as 5 principal components achieves a text recognition accuracy of 98.23%, close to the best accuracy of 98.34%. Compared with the traditional off-line recognizer with 160 dimensions and 50 principal components, it reduces the total memory cost to about 1/3 (from 99.4 MB to 32 MB) and the average time cost of character recognition during text recognition to about 4/5 (from 0.1672 ms to 0.1349 ms per character).
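The trade-off quoted in the abstract can be checked with a few lines of arithmetic. The sketch below uses only the figures stated above; the variable names are illustrative and do not come from the paper, and the best accuracy of 98.34% is treated as the best reported value, not necessarily that of the 160-dimension configuration.

```python
# Figures quoted in the abstract (variable names are illustrative assumptions).
baseline_memory_mb = 99.4    # traditional off-line recognizer: 160 dims, 50 principal components
compact_memory_mb = 32.0     # compact off-line recognizer: 50 dims, 5 principal components
baseline_time_ms = 0.1672    # average recognition time per character, traditional configuration
compact_time_ms = 0.1349     # average recognition time per character, compact configuration
best_accuracy = 98.34        # best reported text recognition accuracy (%)
compact_accuracy = 98.23     # accuracy of the compact configuration (%)

# Ratios of compact cost to baseline cost.
memory_ratio = compact_memory_mb / baseline_memory_mb
time_ratio = compact_time_ms / baseline_time_ms

# Accuracy given up by choosing the compact configuration, in percentage points.
accuracy_gap = best_accuracy - compact_accuracy

print(f"memory ratio: {memory_ratio:.3f} (abstract: about 1/3)")
print(f"time ratio:   {time_ratio:.3f} (abstract: about 4/5)")
print(f"accuracy gap: {accuracy_gap:.2f} percentage points")
```

The memory ratio comes out slightly below 1/3 and the time ratio slightly above 4/5, so the rounded fractions in the abstract are consistent with the raw figures.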
Graphical abstract: 3D surface graph of text recognition accuracies obtained with different text recognizers.