Free paper download: Semi-automatic ground truth generation using unsupervised clustering and limited manual labeling: Application to handwritten character recognition

Article code	Journal code	Publication year	English paper	Full-text version
533803	870167	2015	6-page PDF	Free download

English title of the ISI paper

Semi-automatic ground truth generation using unsupervised clustering and limited manual labeling: Application to handwritten character recognition

Keywords

Feature selection; Classifier combination; Clustering; Character recognition

Related topics

Engineering and Basic Sciences > Computer Engineering > Computer Vision and Pattern Recognition

Abstract

• We propose a fast and accurate semi-automatic labeling strategy.
• Human expert involvement (time and cost) is reduced to a minimum.
• Unsupervised clustering and a voting mechanism decide the labels.
• The method is generic and can be applied to other types of data.

Training supervised classifiers to recognize different patterns requires large data collections with accurate labels. In this paper, we propose a generic, semi-automatic labeling technique for large handwritten character collections. To speed up the creation of a large-scale ground truth, the method combines unsupervised clustering with minimal expert knowledge. To exploit the potential discriminant complementarities across features, each character is projected into five different feature spaces. After the images are clustered in each feature space, the human expert labels the cluster centers. Each data point inherits the label of its cluster's center. A majority (or unanimity) vote then decides the label of each character image. The amount of human involvement (labeling) is strictly controlled by the number of clusters produced by the chosen clustering approach. To test the efficiency of the proposed approach, we compared and evaluated three state-of-the-art clustering methods (k-means, self-organizing maps, and growing neural gas) on the MNIST digit data set and a Lampung Indonesian character data set. Using a k-NN classifier, we show that manually labeling only 1.3% (MNIST) and 3.2% (Lampung) of the training data provides the same range of performance as a completely labeled data set.
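The labeling pipeline described in the abstract (cluster in several feature spaces, label only the cluster centers, propagate, then vote) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic 2-D blobs rather than character images, the three "feature spaces" are toy linear transforms (the paper uses five), clustering is a plain k-means with a deterministic farthest-first seeding chosen here for reproducibility, and the expert is simulated by the hypothetical `ask_expert` function, which labels the sample nearest to each cluster center.

```python
import numpy as np

def pairwise_dist(X, centers):
    """Euclidean distance from every sample to every center, shape (n, k)."""
    return np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)

def farthest_first_init(X, k):
    """Deterministic seeding (an assumption of this sketch): start at X[0],
    then repeatedly add the sample farthest from all chosen centers."""
    centers = [X[0]]
    min_dist = np.linalg.norm(X - centers[0], axis=1)
    for _ in range(k - 1):
        idx = int(min_dist.argmax())
        centers.append(X[idx])
        min_dist = np.minimum(min_dist, np.linalg.norm(X - X[idx], axis=1))
    return np.array(centers, dtype=float)

def kmeans(X, k, n_iter=20):
    """Plain Lloyd iterations; returns (centers, cluster index per sample)."""
    centers = farthest_first_init(X, k)
    for _ in range(n_iter):
        assign = pairwise_dist(X, centers).argmin(axis=1)
        for j in range(k):
            members = X[assign == j]
            if len(members):  # keep the old center if a cluster is empty
                centers[j] = members.mean(axis=0)
    return centers, pairwise_dist(X, centers).argmin(axis=1)

def propagate_labels(X, k, ask_expert):
    """Cluster one feature space, query a label for the sample nearest to
    each center, and let every sample inherit its cluster's label."""
    centers, assign = kmeans(X, k)
    reps = pairwise_dist(X, centers).argmin(axis=0)
    rep_labels = np.array([ask_expert(int(i)) for i in reps])
    return rep_labels[assign]

def majority_vote(votes, n_classes):
    """votes has shape (n_views, n_samples); ties go to the lowest class index."""
    counts = np.stack([(votes == c).sum(axis=0) for c in range(n_classes)], axis=1)
    return counts.argmax(axis=1)

# Synthetic stand-in for character images: three well-separated 2-D blobs.
rng = np.random.default_rng(0)
blob_centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
true_labels = np.repeat(np.arange(3), 100)
X = blob_centers[true_labels] + rng.uniform(-1.0, 1.0, size=(300, 2))

# Toy "feature spaces": three different views of the same samples.
views = [
    X,
    X * np.array([1.0, 0.5]),
    np.column_stack([X[:, 0] + X[:, 1], X[:, 0] - X[:, 1]]),
]

queried = set()  # indices of samples the simulated expert had to label

def ask_expert(i):
    queried.add(i)
    return int(true_labels[i])

votes = np.array([propagate_labels(V, 6, ask_expert) for V in views])
final_labels = majority_vote(votes, n_classes=3)
accuracy = float((final_labels == true_labels).mean())
print(f"manually labeled: {len(queried)} of {len(X)}; accuracy: {accuracy:.3f}")
```

The number of expert queries is bounded by the number of clusters per view times the number of views (here at most 18 of 300 samples), which mirrors the abstract's point that human effort is controlled by the cluster count. On this deliberately easy toy data every cluster is pure, so the vote recovers all labels; real handwritten data would not be this clean.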

Publisher

Database: Elsevier - ScienceDirect
Journal: Pattern Recognition Letters - Volume 58, 1 June 2015, Pages 23–28

Authors

Szilárd Vajda, Yves Rangoni, Hubert Cecotti
