Article code: 10360380
Journal code: 869792
Publication year: 2014
English article: 10-page PDF
Full-text version: free download
English title of the ISI article
Semi-supervised learning for character recognition in historical archive documents
Persian translation of the title
یادگیری نیمه نظارتی برای شناخت شخصیت در اسناد آرشیو تاریخی
Keywords
Related topics
Engineering and Basic Sciences > Computer Engineering > Computer Vision and Pattern Recognition
English abstract
Training recognizers for handwritten characters is still a very time-consuming task involving tremendous amounts of manual annotations by experts. In this paper we present semi-supervised labeling strategies that are able to considerably reduce the human effort. We propose two different methods to label and later recognize characters in collections of historical archive documents. The first one is based on clustering of different feature representations and the second one incorporates a simultaneous retrieval on different representations. Hence, both approaches are based on multi-view learning and later apply a voting procedure for reliably propagating annotations to unlabeled data. We evaluate our methods on the MNIST database of handwritten digits and introduce a realistic application in the form of a database of handwritten historical weather reports. The experiments show that our method is able to significantly reduce the human effort that is required to build a character recognizer for the data collection considered while still achieving recognition rates that are close to a supervised classification experiment.
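The abstract's idea of voting across several views before propagating an annotation can be illustrated with a small sketch. This is not the paper's algorithm. It assumes that each feature representation ("view") has already been clustered by some method, that a few samples were labeled by hand, and that a label is propagated only when enough views agree. The function name `propagate_labels` and the parameter `min_agreement` are hypothetical and chosen for illustration only.

```python
from collections import Counter


def propagate_labels(view_clusters, seed_labels, min_agreement=None):
    """Propagate manual labels to unlabeled samples by voting across views.

    view_clusters: list of lists; view_clusters[v][i] is the cluster id of
        sample i in view v (clustering itself is assumed to be done elsewhere).
    seed_labels: dict mapping sample index -> manually assigned label.
    min_agreement: number of views that must vote for the same label
        (hypothetical parameter; defaults to all views, i.e. unanimity).
    Returns a dict mapping sample index -> propagated label.
    """
    n_views = len(view_clusters)
    if min_agreement is None:
        min_agreement = n_views

    # In each view, name every cluster after the most common seed label in it.
    cluster_label = []
    for clusters in view_clusters:
        votes = {}
        for idx, label in seed_labels.items():
            votes.setdefault(clusters[idx], Counter())[label] += 1
        cluster_label.append(
            {cid: counts.most_common(1)[0][0] for cid, counts in votes.items()}
        )

    # Each view casts one vote per unlabeled sample; keep confident ones only.
    propagated = {}
    for i in range(len(view_clusters[0])):
        if i in seed_labels:
            continue
        ballot = Counter()
        for v in range(n_views):
            label = cluster_label[v].get(view_clusters[v][i])
            if label is not None:
                ballot[label] += 1
        if ballot:
            label, count = ballot.most_common(1)[0]
            if count >= min_agreement:
                propagated[i] = label
    return propagated


# Toy data: six samples, two views, two hand-labeled seeds.
view_a = [0, 0, 0, 1, 1, 1]
view_b = [0, 0, 1, 1, 1, 1]
seeds = {0: "3", 5: "8"}
print(propagate_labels([view_a, view_b], seeds))  # {1: '3', 3: '8', 4: '8'}
```

In the toy data, sample 2 falls into differently labeled clusters in the two views, so it receives no label and would remain a candidate for manual annotation. Requiring unanimity trades coverage for reliability; lowering `min_agreement` labels more samples at a higher risk of error.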
Publisher
Database: Elsevier - ScienceDirect
Journal: Pattern Recognition - Volume 47, Issue 3, March 2014, Pages 1011-1020
Authors