Article code: 536365
Journal code: 870505
Publication year: 2014
English article: 12 pages (PDF)
English title of the ISI article
An annotation assistance system using an unsupervised codebook composed of handwritten graphical multi-stroke symbols
Keywords
Graphical symbol knowledge extraction, graphical symbol retrieval, spatial relations, minimum description length principle, online handwriting
Related topics
Engineering and Basic Sciences, Computer Engineering, Computer Vision and Pattern Recognition
Abstract

Many current recognition systems rely on ground-truthed datasets for training, evaluation, and testing, but creating such datasets is a tedious task. This paper proposes an iterative, unsupervised framework for learning handwritten graphical symbols that can assist with this labeling task. Each stroke is initialized as a segment, and a relational graph is built in which the nodes are the segments and the edges are the spatial relations between them. To extract the relevant patterns, the segments and the spatial relations are quantized. Discovering graphical symbols then becomes the problem of finding sub-graphs according to the Minimum Description Length (MDL) principle. The discovered graphical symbols become the new segments for the next iteration. In each iteration, the quantization of segments yields the codebook in which the user can label graphical symbols. This original method was first applied to a dataset of simple mathematical expressions. The reported results show that only 58.2% of the strokes have to be labeled manually.
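The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the three-way segment codebook (`h`, `v`, `o`), the relation codebook (`cross`, `beside`, `stacked`), the thresholds `eps` and `max_dist`, and the bit counting in `description_length` are all assumptions chosen for illustration, and only two-segment patterns are scored, for a single iteration.

```python
from math import hypot, log2

def center(stroke):
    """Center of the stroke's bounding box; a stroke is a list of (x, y) points."""
    xs = [x for x, _ in stroke]
    ys = [y for _, y in stroke]
    return (min(xs) + max(xs)) / 2, (min(ys) + max(ys)) / 2

def quantize_segment(stroke):
    """Toy segment codebook (assumption): 'h' wide, 'v' tall, 'o' other."""
    xs = [x for x, _ in stroke]
    ys = [y for _, y in stroke]
    w, h = max(xs) - min(xs), max(ys) - min(ys)
    if w > 2 * h:
        return "h"
    if h > 2 * w:
        return "v"
    return "o"

def quantize_relation(a, b, eps=0.2):
    """Toy symmetric relation codebook (assumption) from the offset of centers."""
    (ax, ay), (bx, by) = center(a), center(b)
    dx, dy = bx - ax, by - ay
    if hypot(dx, dy) <= eps:
        return "cross"
    return "beside" if abs(dx) >= abs(dy) else "stacked"

def build_graph(strokes, max_dist=2.0):
    """Nodes are quantized segments; edges link nearby segments by relation."""
    labels = [quantize_segment(s) for s in strokes]
    edges = []
    for i in range(len(strokes)):
        for j in range(i + 1, len(strokes)):
            (ax, ay), (bx, by) = center(strokes[i]), center(strokes[j])
            if hypot(bx - ax, by - ay) <= max_dist:
                edges.append((i, j, quantize_relation(strokes[i], strokes[j])))
    return labels, edges

def description_length(n_nodes, n_edges, n_node_labels, n_edge_labels):
    """Simplified bit count: a label per node; a label and two endpoints per edge."""
    if n_nodes == 0:
        return 0.0
    node_bits = n_nodes * log2(max(n_node_labels, 2))
    edge_bits = n_edges * (log2(max(n_edge_labels, 2)) + 2 * log2(max(n_nodes, 2)))
    return node_bits + edge_bits

def best_pattern(labels, edges):
    """Return the two-node pattern S minimizing DL(S) + DL(G | S), with both totals."""
    n_nl = len(set(labels))
    n_el = len({r for _, _, r in edges})
    base = description_length(len(labels), len(edges), n_nl, n_el)
    instances = {}
    for i, j, r in edges:
        key = (tuple(sorted((labels[i], labels[j]))), r)
        instances.setdefault(key, []).append((i, j))
    best_key, best_total = None, base
    for key, pairs in instances.items():
        used, count = set(), 0
        for i, j in pairs:  # greedy selection of non-overlapping instances
            if i not in used and j not in used:
                used.update((i, j))
                count += 1
        # Each instance collapses 2 nodes and 1 edge into 1 node with a new label.
        total = description_length(2, 1, n_nl, n_el) + description_length(
            len(labels) - count, len(edges) - count, n_nl + 1, n_el
        )
        if total < best_total:
            best_key, best_total = key, total
    return best_key, best_total, base

# Toy input: "+ 1 + 1 +" drawn with two strokes per plus sign.
strokes = [
    [(0.0, 0.5), (1.0, 0.5)], [(0.5, 0.0), (0.5, 1.0)],  # plus
    [(2.0, 0.0), (2.0, 1.0)],                            # one
    [(3.0, 0.5), (4.0, 0.5)], [(3.5, 0.0), (3.5, 1.0)],  # plus
    [(5.0, 0.0), (5.0, 1.0)],                            # one
    [(6.0, 0.5), (7.0, 0.5)], [(6.5, 0.0), (6.5, 1.0)],  # plus
]
labels, edges = build_graph(strokes)
pattern, total_bits, base_bits = best_pattern(labels, edges)
print(pattern, round(total_bits, 1), round(base_bits, 1))
```

On this toy input, the pattern made of an `h` and a `v` segment in a `cross` relation (a plus-like symbol) compresses the graph the most, so it would become a new segment for the next iteration.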


► A semi-automatic annotation system for unknown 2D graphical languages is proposed.
► A multi-stroke symbol codebook based on recurrent patterns is automatically defined.
► A relational graph between graphical units and Minimum Description Length are used.
► First experiments on a handwritten corpus show a reduction of the labeling cost.
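
How a labeled codebook can reduce manual effort is sketched below. The accounting is a hypothetical illustration, not the cost measure used in the paper: `propagate_labels` and `manual_fraction` are invented names, and the instance list is toy data.

```python
def propagate_labels(instances, codebook_labels):
    """Copy each user-labeled codebook entry onto the strokes of its instances.

    instances: list of (codebook_id, stroke_ids) pairs (hypothetical format).
    codebook_labels: dict mapping codebook_id to the label given by the user.
    """
    stroke_labels = {}
    for code_id, stroke_ids in instances:
        label = codebook_labels.get(code_id)
        if label is not None:
            for stroke_id in stroke_ids:
                stroke_labels[stroke_id] = label
    return stroke_labels

def manual_fraction(n_strokes, stroke_labels):
    """Fraction of strokes that still need an individual manual label."""
    return (n_strokes - len(stroke_labels)) / n_strokes

# Toy data: codebook entry 0 groups three two-stroke symbols, entry 1 two single strokes.
instances = [(0, [0, 1]), (0, [3, 4]), (0, [6, 7]), (1, [2]), (1, [5])]
user_labels = {0: "+"}  # the user labels one codebook entry only
stroke_labels = propagate_labels(instances, user_labels)
remaining = manual_fraction(8, stroke_labels)
print(stroke_labels, remaining)
```

In this toy case, one label on a codebook entry covers six of the eight strokes, which is the kind of saving the highlights refer to.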

Publisher
Database: Elsevier - ScienceDirect
Journal: Pattern Recognition Letters - Volume 35, 1 January 2014, Pages 46–57