Free article download: Unlabeled data may improve classification accuracy

Article code	Journal code	Publication year	English article	Full-text version
534539	870265	2014	9-page PDF	Free download

English title of the ISI article

Unlabeling data can improve classification accuracy

Persian translation of the title

Unlabeled data may improve classification accuracy

Keywords

Partially supervised learning; Microarray data; Classification; Transductive learning; Semi-supervised learning

Related topics

Engineering and Basic Sciences > Computer Engineering > Computer Vision and Pattern Recognition

English abstract

• Early classification in clinical studies is possible with transductive learners.
• Removal of labels can improve classification accuracy.
• Algorithms are misled to different degrees by correctly labeled data.
• The data, not the percentage of labeled samples, has the strongest influence on prediction.

In this study we focus on the effects of sample limitations on partially supervised learning algorithms. We analyze the performance of these types of learning algorithms on small datasets under varying trade-offs between labeled and unlabeled samples. In contrast to the typical settings for partially supervised learning algorithms, the number of available unlabeled samples is also restricted.

We utilize gene expression datasets, which are typical examples of data collections of small sample size. DNA microarrays are used to generate these profiles by measuring thousands of mRNA values simultaneously. These profiles are increasingly used for tumor categorization. Partially labeled microarray datasets occur naturally in the diagnostic setting if the corresponding labeling process is time consuming or expensive (i.e., “early relapse” vs. “late relapse”).

Surprisingly, the best classification results in our study were not always achieved for a maximal proportion of labeled samples. This is unexpected, as asymptotic results for an unlimited number of samples suggest that a labeled sample is of exponentially higher value than an unlabeled one. Our analysis shows that in the case of finite sample sizes a more balanced trade-off between labeled and unlabeled samples is optimal. This trade-off was not unique over all experiments. It could be shown that the optimal trade-off between unlabeled and labeled samples is mainly dependent on the chosen learning algorithm.
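The kind of experiment the abstract describes, varying the split between labeled and unlabeled samples on a small dataset, can be illustrated with a toy example. The following is only a minimal sketch under stated assumptions and not the authors' protocol: it uses synthetic Gaussian data instead of microarray profiles, a simple nearest-centroid self-training learner chosen for illustration, and hypothetical names (`make_classes`, `transductive_accuracy`) that do not come from the paper.

```python
import random

def make_classes(n_per_class, dim, shift, rng):
    """Synthetic stand-in for a small dataset: two Gaussian classes,
    class 1 shifted by `shift` in every feature (an assumption)."""
    class0 = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_per_class)]
    class1 = [[rng.gauss(shift, 1.0) for _ in range(dim)] for _ in range(n_per_class)]
    return class0, class1

def centroid(points):
    """Feature-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(points) for col in zip(*points)]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((u - v) ** 2 for u, v in zip(a, b))

def predict(x, centroids):
    """Return the label whose centroid is closest to x."""
    return min(centroids, key=lambda label: sq_dist(x, centroids[label]))

def transductive_accuracy(class0, class1, n_labeled_per_class, rounds=5):
    """Nearest-centroid self-training (illustrative learner, not from the paper).

    The first k samples of each class keep their labels; the rest are
    treated as unlabeled. Centroids are refit on the labeled samples plus
    the current predictions for the unlabeled ones. Accuracy is measured
    on the unlabeled samples, i.e. in a transductive setting.
    Requires 1 <= k < number of samples per class.
    """
    k = n_labeled_per_class
    labeled = [(x, 0) for x in class0[:k]] + [(x, 1) for x in class1[:k]]
    unlabeled = [(x, 0) for x in class0[k:]] + [(x, 1) for x in class1[k:]]
    training = list(labeled)
    predictions = []
    for _ in range(rounds):
        centroids = {}
        for label in (0, 1):
            points = [x for x, y in training if y == label]
            if points:
                centroids[label] = centroid(points)
        predictions = [predict(x, centroids) for x, _ in unlabeled]
        # Unlabeled samples re-enter training with their predicted labels.
        training = labeled + [(x, p) for (x, _), p in zip(unlabeled, predictions)]
    correct = sum(p == y for p, (_, y) in zip(predictions, unlabeled))
    return correct / len(unlabeled)

# Vary the number of labeled samples while the total sample size stays fixed.
rng = random.Random(0)
class0, class1 = make_classes(n_per_class=20, dim=50, shift=0.3, rng=rng)
for k in (2, 5, 10, 15):
    acc = transductive_accuracy(class0, class1, k)
    print(f"{k} labeled per class: accuracy on unlabeled = {acc:.2f}")
```

Because the total number of samples is fixed, increasing the labeled share shrinks the unlabeled pool, which mirrors the restricted setting described in the abstract. Results on this toy data say nothing about the paper's findings; repeating the loop over many random seeds and several learners would be needed before drawing any conclusion.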

Publisher

Database: Elsevier - ScienceDirect
Journal: Pattern Recognition Letters - Volume 37, 1 February 2014, Pages 15–23

Authors

Ludwig Lausser, Florian Schmid, Matthias Schmid, Hans A. Kestler
