Article ID: 10341030
Journal ID: 695319
Publication year: 2014
English article: 13 pages, PDF
Full-text version: free download
English title of the ISI article
Contextual modeling for logical labeling of PDF documents
Related topics
Engineering and Basic Sciences, Computer Engineering, Computer Networks and Communications
English abstract
Documents in the widely used Portable Document Format (PDF) are known to be layout-oriented and poorly suited to mobile applications. In this paper, a model based on Conditional Random Fields (CRF) is proposed to learn the latent semantics of PDF page content. Local and contextual observations constructed from PDF attributes are incorporated to help determine semantic roles. The observations are designed to remain effective across documents of different styles. A local classifier is first used to generate posterior probabilities. These local estimates are then fed to the CRF model for joint classification. The experimental results confirm that contextual information improves logical labeling. Our work shows the potential of existing born-digital, fixed-layout documents to be used in mobile applications.
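The two-stage idea in the abstract (local posteriors first, then joint classification) can be illustrated with a minimal sketch. It assumes a linear-chain model decoded with the Viterbi algorithm, which may differ from the CRF structure used in the paper. The label set, the posterior values, and the transition weights below are all hypothetical and chosen only for illustration.

```python
import math

# Hypothetical label set; the paper's actual logical labels may differ.
LABELS = ["title", "paragraph", "caption"]
EPS = 1e-12  # guards against log(0)


def viterbi(local_posteriors, transition, labels=LABELS):
    """Jointly label a sequence of text blocks.

    local_posteriors: one dict per block, label -> P(label | local features),
                      as a local classifier might produce.
    transition: dict (prev_label, label) -> positive compatibility weight,
                standing in for the contextual part of the model.
    """
    if not local_posteriors:
        return []

    def log_t(prev, cur):
        return math.log(transition.get((prev, cur), 0.0) + EPS)

    # score[l] = best log-score of any labeling that ends in label l
    score = {l: math.log(local_posteriors[0].get(l, 0.0) + EPS) for l in labels}
    backpointers = []
    for post in local_posteriors[1:]:
        new_score, pointers = {}, {}
        for cur in labels:
            best_prev = max(labels, key=lambda p: score[p] + log_t(p, cur))
            pointers[cur] = best_prev
            new_score[cur] = (score[best_prev] + log_t(best_prev, cur)
                              + math.log(post.get(cur, 0.0) + EPS))
        backpointers.append(pointers)
        score = new_score

    # Trace the best path backwards from the best final label.
    path = [max(labels, key=lambda l: score[l])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical local estimates for three consecutive blocks on a page.
posteriors = [
    {"title": 0.80, "paragraph": 0.15, "caption": 0.05},
    {"title": 0.05, "paragraph": 0.45, "caption": 0.50},  # locally ambiguous
    {"title": 0.05, "paragraph": 0.85, "caption": 0.10},
]
# Hypothetical contextual weights: paragraphs tend to follow titles and paragraphs.
transition = {
    ("title", "title"): 0.1, ("title", "paragraph"): 0.8, ("title", "caption"): 0.1,
    ("paragraph", "title"): 0.1, ("paragraph", "paragraph"): 0.7, ("paragraph", "caption"): 0.2,
    ("caption", "title"): 0.1, ("caption", "paragraph"): 0.6, ("caption", "caption"): 0.3,
}

local_only = [max(p, key=p.get) for p in posteriors]
joint = viterbi(posteriors, transition)
print("local:", local_only)
print("joint:", joint)
```

With these made-up numbers, the local classifier alone labels the second block as a caption, while joint decoding relabels it as a paragraph because its neighbors make that reading more consistent. This is the kind of effect the abstract attributes to contextual information.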
Publisher
Database: Elsevier - ScienceDirect
Journal: Computers & Electrical Engineering - Volume 40, Issue 4, May 2014, Pages 1363-1375