Article code: 10322280
Journal code: 660850
Publication year: 2015
English article: 12-page PDF
Full-text version: free download
English title of the ISI article
Batch-adaptive rejection threshold estimation with application to OCR post-processing
Related topics
Engineering and Basic Sciences, Computer Engineering, Artificial Intelligence
English abstract
The approach is based on the estimation of an expected error vs. transformation cost distribution. First, a model predicting the probability of a cost to arise from an erroneously transcribed string is computed from a sample of supervised OCR hypotheses. Then, given a test sample, a cumulative error vs. cost curve is computed and used to automatically set the appropriate threshold that meets the user-defined error rate on the overall sample. The results of experiments on batches coming from different writing styles show very accurate error rate estimations where fixed thresholding clearly fails. An original procedure to generate distorted strings from a given language is also proposed and tested, which allows the use of the presented method in tasks where no real supervised OCR hypotheses are available to train the system.
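
The two steps described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the binned error model with add-one smoothing, the function names `fit_error_model` and `batch_threshold`, and the choice to measure the error rate as expected accepted errors divided by the full batch size are all assumptions made for the example.

```python
from bisect import bisect_right


def fit_error_model(costs, is_error, bin_edges):
    """Estimate P(error | cost) from supervised hypotheses.

    Assumption: costs are grouped into bins delimited by bin_edges, and
    each bin's error probability is smoothed with add-one (Laplace) counts.
    """
    n_bins = len(bin_edges) + 1
    errors = [0] * n_bins
    totals = [0] * n_bins
    for cost, err in zip(costs, is_error):
        b = bisect_right(bin_edges, cost)
        totals[b] += 1
        errors[b] += int(err)
    probs = [(errors[b] + 1) / (totals[b] + 2) for b in range(n_bins)]

    def p_error(cost):
        return probs[bisect_right(bin_edges, cost)]

    return p_error


def batch_threshold(batch_costs, p_error, target_error_rate):
    """Pick the largest cost threshold whose expected error rate on the
    whole batch stays within target_error_rate.

    Hypotheses with cost <= threshold are accepted. Returns None when
    even the cheapest hypotheses would exceed the target.
    """
    ordered = sorted(batch_costs)
    n = len(ordered)
    expected_errors = 0.0
    threshold = None
    for i, cost in enumerate(ordered):
        # Cumulative expected error vs. cost curve for this batch.
        expected_errors += p_error(cost)
        if expected_errors / n > target_error_rate:
            break
        # Commit only once every hypothesis sharing this cost has been counted.
        if i + 1 == n or ordered[i + 1] != cost:
            threshold = cost
    return threshold
```

Because the curve is recomputed from the costs of each test batch, the selected threshold shifts with the batch's cost distribution, which is the behaviour a single fixed threshold cannot provide.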
Publisher
Database: Elsevier - ScienceDirect
Journal: Expert Systems with Applications - Volume 42, Issue 21, 30 November 2015, Pages 8111-8122