Article code: 529289
Journal code: 869643
Publication year: 2010
English article: 15 pages, PDF
Full-text version: free download
English title of the ISI article
Segmentation of historical machine-printed documents using Adaptive Run Length Smoothing and skeleton segmentation paths
Related topics
Engineering and Basic Sciences; Computer Engineering; Computer Vision and Pattern Recognition
English abstract

In this paper, we develop efficient techniques for segmenting document pages produced by digitizing historical machine-printed sources. Such documents often suffer from low quality and local skew, show several degradations caused by the quality of the old printing matrix or by ink diffusion, and exhibit complex and dense layouts. To address these problems, we introduce the following innovative aspects: (i) a novel Adaptive Run Length Smoothing Algorithm (ARLSA) to handle complex and dense document layouts, (ii) detection of noisy areas and punctuation marks, which are common in historical machine-printed documents, (iii) detection of possible obstacles formed by background areas in order to separate neighboring text columns or text lines, and (iv) use of skeleton segmentation paths to isolate possibly connected characters. Comparative experiments on several historical machine-printed documents demonstrate the efficiency of the proposed technique.
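
For readers unfamiliar with run length smoothing, the following is a minimal sketch of the classic, fixed-threshold horizontal RLSA on a binary image. It is not the adaptive ARLSA proposed in the paper, whose rules are not given in the abstract. The function name `rlsa_horizontal` and the parameter `max_gap` are assumptions introduced here for illustration only.

```python
def rlsa_horizontal(image, max_gap):
    """Classic (non-adaptive) horizontal run length smoothing.

    image:   list of rows, each a list of 0 (background) or 1 (foreground).
    max_gap: hypothetical fixed threshold; a background run is filled only
             if it lies between two foreground pixels in the same row and
             its length is at most max_gap.
    Returns a new image; the input is left unchanged.
    """
    smoothed = [row[:] for row in image]
    for row in smoothed:
        last_fg = None  # column of the most recent foreground pixel
        for x, pixel in enumerate(row):
            if pixel != 1:
                continue
            if last_fg is not None:
                gap = x - last_fg - 1  # background pixels between the two
                if 0 < gap <= max_gap:
                    for k in range(last_fg + 1, x):
                        row[k] = 1
            last_fg = x
    return smoothed


# Toy example: gaps of length 1 and 2 are bridged, the gap of length 4 is not.
page = [
    [1, 0, 0, 1, 0, 0, 0, 0, 1],
    [0, 0, 1, 0, 1, 0, 0, 0, 0],
]
result = rlsa_horizontal(page, max_gap=2)
```

With a fixed threshold, nearby characters merge into word or line blobs, but so may neighboring columns in a dense layout, which is the kind of problem the abstract says the adaptive variant is meant to address.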

Publisher
Database: Elsevier - ScienceDirect
Journal: Image and Vision Computing - Volume 28, Issue 4, April 2010, Pages 590–604