Article code: 529879
Journal code: 869719
Publication year: 2015
English article: 15 pages, PDF
English title of the ISI article
Document dewarping via text-line based optimization
Related topics
Engineering and Basic Sciences; Computer Engineering; Computer Vision and Pattern Recognition
Abstract


• Document dewarping is an important problem in camera-based OCR.
• We formulate dewarping as an optimization problem.
• The proposed method performs dewarping fully automatically.
• The proposed method can handle various document layouts.
• Our method yields improved OCR performance compared with other methods.

This paper presents a new document image dewarping method that removes geometric distortions in camera-captured document images. The proposed method does not directly use text-lines, which have been the most widely used feature for document dewarping. Instead, we use a discrete representation of text-lines and text-blocks as sets of connected components. We model the geometric distortions caused by page curl and perspective view as generalized cylindrical surfaces and camera rotation, respectively. With these distortion models and the discrete representation of the features, we design a cost function whose minimization yields the parameters of the distortion model. The cost function encodes properties of the page such as text-block alignment, line spacing, and the straightness of text-lines. Because the text features are described by sets of discrete points, the cost function can be easily defined and efficiently minimized by the Levenberg–Marquardt algorithm. Experiments show that the proposed method works well for various layouts and curved surfaces, and compares favorably with conventional methods on the standard dataset.
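
The idea of minimizing a straightness cost over discrete text-line points can be illustrated with a small toy sketch. It is not the paper's method: the quadratic curl profile `curl`, the per-line baseline parameters, the synthetic data, and every function name below are assumptions made for illustration, and the Levenberg–Marquardt loop is a hand-written minimal version rather than a library routine. The abstract's generalized cylindrical surface, camera rotation, text-block alignment, and line-spacing terms are all omitted.

```python
# Toy sketch (assumed model, not the paper's): recover a shared curl profile
# from points sampled on text-lines by minimizing a straightness cost.

def curl(x, a):
    """Hypothetical curl profile g(x) = a[0]*x + a[1]*x**2."""
    return a[0] * x + a[1] * x * x

def residuals(params, lines):
    """After removing the curl, each point should lie on its line's baseline.
    params = [a0, a1, baseline_0, ..., baseline_{K-1}]."""
    a, baselines = params[:2], params[2:]
    return [y - curl(x, a) - baselines[k]
            for k, pts in enumerate(lines) for (x, y) in pts]

def cost_of(params, lines):
    """Sum of squared residuals."""
    return sum(r * r for r in residuals(params, lines))

def jacobian(params, lines, eps=1e-6):
    """Forward-difference Jacobian, one row per residual."""
    base = residuals(params, lines)
    cols = []
    for j in range(len(params)):
        bumped = list(params)
        bumped[j] += eps
        moved = residuals(bumped, lines)
        cols.append([(m - b) / eps for b, m in zip(base, moved)])
    return [list(row) for row in zip(*cols)]

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def levenberg_marquardt(params, lines, iters=50, lam=1e-3):
    """Minimal LM loop: accept a step only if it lowers the cost."""
    n = len(params)
    cost = cost_of(params, lines)
    for _ in range(iters):
        r = residuals(params, lines)
        J = jacobian(params, lines)
        JtJ = [[sum(row[i] * row[j] for row in J) for j in range(n)]
               for i in range(n)]
        Jtr = [sum(row[i] * ri for row, ri in zip(J, r)) for i in range(n)]
        # Damped normal equations: (JtJ + lam * diag(JtJ)) step = -Jtr
        A = [[JtJ[i][j] + (lam * JtJ[i][i] if i == j else 0.0)
              for j in range(n)] for i in range(n)]
        step = solve(A, [-g for g in Jtr])
        trial = [p + s for p, s in zip(params, step)]
        trial_cost = cost_of(trial, lines)
        if trial_cost < cost:
            params, cost, lam = trial, trial_cost, lam * 0.1
        else:
            lam *= 10.0
    return params, cost

# Synthetic data: three text-lines bent by the same (assumed) curl profile.
true_a = [0.05, -0.02]
xs = [i * 0.5 for i in range(-6, 7)]
lines = [[(x, base + curl(x, true_a)) for x in xs] for base in (0.0, 1.0, 2.0)]

est, cost = levenberg_marquardt([0.0] * 5, lines)
# Straightened points: subtract the estimated curl from every observation.
flat = [[(x, y - curl(x, est[:2])) for (x, y) in pts] for pts in lines]
```

In this sketch each point contributes one residual, so adding further page properties would amount to appending more residual terms to the same list. The toy cost is linear in its parameters, which makes it far easier to minimize than a cost built on a full surface and camera model.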

Publisher
Database: Elsevier - ScienceDirect
Journal: Pattern Recognition - Volume 48, Issue 11, November 2015, Pages 3600–3614
Authors
, , ,