Gaussian mixture modeling and learning of neighboring characters for multilingual text extraction in images

Article code	Journal code	Publication year	English article	Full-text version
530807	869790	2008	10-page PDF	Free download

Keywords

Discriminative training; Text extraction; EM algorithm; Image retrieval; Document analysis; Character recognition; Gaussian mixture models

Related topics

Engineering and Basic Sciences; Computer Engineering; Computer Vision and Pattern Recognition

Abstract

This paper proposes an approach to extracting multilingual text from images, based on the statistical modeling and learning of neighboring characters. The case of three neighboring characters is represented as a Gaussian mixture model and discriminated from other cases by a corresponding ‘pseudo-probability’ defined under the Bayesian framework. Based on this modeling, text extraction is completed by labeling each connected component in the binary image as character or non-character according to its neighbors; a method based on mathematical morphology detects and connects the separated parts of each character, and a method based on Voronoi partition establishes the neighborhoods of connected components. We further present a discriminative training algorithm based on the maximum–minimum similarity (MMS) criterion to estimate the parameters of the proposed text extraction approach. Experimental results on Chinese and English text extraction demonstrate the effectiveness of our approach trained with the MMS algorithm, which achieved a precision of 93.56% and a recall of 98.55% on the test data set. The experiments also show that MMS significantly improves overall performance compared with the influential training criteria of maximum likelihood (ML) and minimum classification error (MCE).
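To make the Gaussian mixture modeling step concrete, the following is a minimal sketch of how a mixture density can be evaluated for one feature vector. It is an illustration only: the function name `gmm_log_density`, the feature vector, and all parameter values are assumptions, and the abstract does not specify the paper's features, the number of mixture components, or the definition of the pseudo-probability, so none of those are reproduced here.

```python
import numpy as np

def gmm_log_density(x, weights, means, covariances):
    """Return log p(x) for a Gaussian mixture with full covariance matrices.

    Hypothetical helper for illustration; not the paper's implementation.
    """
    x = np.asarray(x, dtype=float)
    d = x.shape[0]
    log_terms = []
    for w, mu, cov in zip(weights, means, covariances):
        diff = x - np.asarray(mu, dtype=float)
        cov = np.asarray(cov, dtype=float)
        # log|cov| computed stably; the sign is ignored because cov is
        # assumed to be positive definite.
        _, logdet = np.linalg.slogdet(cov)
        # Mahalanobis distance without forming the explicit inverse.
        maha = diff @ np.linalg.solve(cov, diff)
        log_terms.append(np.log(w) - 0.5 * (d * np.log(2 * np.pi) + logdet + maha))
    # Log-sum-exp over components to avoid underflow.
    log_terms = np.array(log_terms)
    m = log_terms.max()
    return m + np.log(np.exp(log_terms - m).sum())

# Example with made-up parameters: a two-component mixture in 2-D.
weights = [0.6, 0.4]
means = [[0.0, 0.0], [3.0, 3.0]]
covariances = [np.eye(2), 2.0 * np.eye(2)]
log_p = gmm_log_density([0.5, -0.2], weights, means, covariances)
```

In a classifier of the kind the abstract describes, a score like `log_p` would be computed for a feature vector derived from a connected component and its neighbors, then turned into a character or non-character decision. How that decision is made in the paper is not stated in the abstract.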

Publisher

Database: Elsevier - ScienceDirect
Journal: Pattern Recognition - Volume 41, Issue 2, February 2008, Pages 484–493

Authors

Xiabi Liu, Hui Fu, Yunde Jia
