Article code: 530211
Journal code: 869750
Publication year: 2015
English article: 12 pages, PDF
Full-text version: free download
English title of the ISI article
Matching based ground-truth annotation for online handwritten mathematical expressions
Persian translation of the title
تطبیق حاشیه نویسی زمین حقایق برای عبارات ریاضی دست نویس آنلاین
Keywords
Structural pattern analysis, shape information, handwriting recognition, linear assignment problem, ground-truth annotation, mathematical expression datasets
Related topics
Engineering and basic sciences, computer engineering, computer vision and pattern recognition
English abstract


• A method for annotating ground-truth data to handwritten mathematical expressions.
• Expression matching formulated as a single assignment problem.
• Evaluation of the influence of local (symbol) and global (structural) features on matching performance.
• Overall mean symbol assignment rate above 99%.

Assessing mathematical expression recognition at the expression level alone is not sufficient to diagnose the strengths and weaknesses of different recognition systems. To make assessment at different levels possible, large datasets annotated with ground-truth data at several levels are needed: symbol segmentation, symbol classification, symbol/sub-expression spatial relationships, baselines, and whole expressions. Creating ground-truthed datasets of handwritten mathematical expressions is challenging because of the large variability of symbol classes, expression layouts, and writing styles, and because manual annotation is error-prone. We propose an expression matching approach in which the symbols of a transcribed expression are assigned to the corresponding symbols of the respective model expression. Matching is formulated as a simple linear assignment problem whose cost is a weighted linear combination of local (symbol) and global (structural) characteristics. Once a symbol-to-symbol assignment is computed, not only symbol labels but all other ground-truth data attached to the model expression can be transferred automatically to the transcribed expression. We use two large, independent test sets to evaluate empirically the influence of the cost function terms on matching performance. Results show mean symbol assignment rates above 99% on both sets, suggesting the potential of the method as a useful tool for creating ground-truthed online mathematical expression datasets.
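
The matching step described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not specify the features, so the local term (aspect-ratio difference) and the global term (distance between normalised symbol positions) are hypothetical stand-ins, as are the weight `alpha`, all function names, and the toy data. The assignment is solved by exhaustive search, which is only practical for a handful of symbols.

```python
from itertools import permutations

# A symbol is (x_center, y_center, width, height). All values are toy data.

def normalise(symbols):
    """Map symbol centres into expression-relative coordinates (one shared scale)."""
    xs = [s[0] for s in symbols]
    ys = [s[1] for s in symbols]
    x0, y0 = min(xs), min(ys)
    scale = max(max(xs) - x0, max(ys) - y0) or 1.0
    return [((x - x0) / scale, (y - y0) / scale) for x, y, _, _ in symbols]

def local_cost(a, b):
    """Assumed symbol-level term: absolute difference in aspect ratio."""
    return abs(a[2] / a[3] - b[2] / b[3])

def global_cost(pa, pb):
    """Assumed structural term: Euclidean distance between normalised positions."""
    return ((pa[0] - pb[0]) ** 2 + (pa[1] - pb[1]) ** 2) ** 0.5

def cost_matrix(transcribed, model, alpha=0.5):
    """Weighted linear combination of the local and global terms."""
    nt, nm = normalise(transcribed), normalise(model)
    return [[alpha * local_cost(t, m) + (1 - alpha) * global_cost(pt, pm)
             for m, pm in zip(model, nm)]
            for t, pt in zip(transcribed, nt)]

def solve_assignment(cost):
    """Brute-force linear assignment: try every permutation, keep the cheapest."""
    n = len(cost)
    return min(permutations(range(n)),
               key=lambda p: sum(cost[i][p[i]] for i in range(n)))

# Model expression "x + y" with ground-truth labels attached.
model = [(10, 50, 10, 20), (30, 50, 12, 12), (50, 50, 10, 20)]
model_labels = ["x", "+", "y"]

# The same expression transcribed at a different scale, symbols in another order.
transcribed = [(105, 98, 22, 40), (22, 100, 19, 41), (64, 101, 25, 24)]

assignment = solve_assignment(cost_matrix(transcribed, model))
labels = [model_labels[j] for j in assignment]  # transfer labels to the transcription
print(assignment, labels)
```

With the assignment in hand, any other annotation stored per model symbol could be copied across in the same way as the labels. For expressions of realistic size, a polynomial-time solver such as `scipy.optimize.linear_sum_assignment` would replace the exhaustive search.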

Publisher
Database: Elsevier - ScienceDirect
Journal: Pattern Recognition - Volume 48, Issue 3, March 2015, Pages 837–848