Free article download: Recognizing input space and target concept drifts in data streams with scarcely labeled and unlabelled instances

Article ID	Journal ID	Publication year	English article	Full-text version
392582	664991	2016	25-page PDF	Free download

English title (ISI article)

Recognizing input space and target concept drifts in data streams with scarcely labeled and unlabelled instances

Keywords

Data stream classification; Input space and target concept drifts; Drift detection; Scarcely labeled and unlabelled streams; Semi-supervised and unsupervised performance indicators; Single-pass active learning filters

Related topics

Engineering and Basic Sciences > Computer Engineering > Artificial Intelligence

Abstract

In classification-based stream mining, drift detection is essential in order to (i) inform operators when unintended system changes occur and (ii) make classifier updates more flexible when changes are intentional. Current detection approaches usually rely on the assumption that fully supervised labeled streams are available for monitoring (the changes in) classifier performance. This is an unrealistic scenario in many on-line real-world applications, as true class labels would have to be known, which usually requires tedious feedback efforts from the operators working with the systems. We propose two techniques to improve the economy and applicability of current drift detection techniques: (i) a semi-supervised approach that employs single-pass active learning filters to select the most interesting samples for supervising classifier performance and (ii) a fully unsupervised approach based on the degree of overlap between a classifier's output certainty distributions, which can be applied to any unlabeled classification stream. For both variants, a specific handling of imbalanced class distributions in the streams is proposed, which also allows possible downtrends in classifier behavior for under-represented classes to be observed. The statistical monitoring of classifier behavior relies on a modified version of the Page-Hinkley test, in which a fading factor and an automatic thresholding concept (based on the Hoeffding bound) were introduced to render it more flexible for detecting successive drift occurrences in a stream. We compared our approaches to the fully supervised variant in two real-world on-line applications, including a systematic analysis of the capabilities of our methods. The semi-supervised approach was able to detect real as well as artificially built-in drifts in these streams with a delay similar to that of the supervised variant (about 5–6 min), while using only 20% actively selected samples. The unsupervised variant was able to detect input space drifts with reasonable delays as well, but failed to detect target concept drifts. Using both approaches in tandem therefore allows us to distinguish between input space and target concept drifts.
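
The abstract mentions a Page-Hinkley test extended with a fading factor. As a rough illustration only, the following Python sketch shows one common way such a faded test can be written; it is not the authors' formulation. The class name `FadingPageHinkley`, the helper `first_alarm`, and all parameter values are assumptions made for this example, and the fixed `threshold` stands in for the automatic Hoeffding-bound thresholding described in the paper, which is not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class FadingPageHinkley:
    """Illustrative Page-Hinkley test with a fading factor (hypothetical sketch)."""
    delta: float = 0.005     # tolerated magnitude of change (assumed value)
    threshold: float = 5.0   # fixed alarm threshold (assumed; the paper sets it automatically)
    fading: float = 0.999    # fading factor on the cumulative sum (assumed value)
    n: int = 0
    mean: float = 0.0
    cum: float = 0.0
    cum_min: float = 0.0

    def update(self, x: float) -> bool:
        """Feed one indicator value (e.g. 1.0 for an error); return True on an alarm."""
        self.n += 1
        self.mean += (x - self.mean) / self.n  # running mean of the indicator
        # Faded cumulative deviation from the running mean.
        self.cum = self.fading * self.cum + (x - self.mean - self.delta)
        self.cum_min = min(self.cum_min, self.cum)
        if self.cum - self.cum_min > self.threshold:
            self.reset()  # restart so that successive drifts can be detected
            return True
        return False

    def reset(self) -> None:
        self.n = 0
        self.mean = 0.0
        self.cum = 0.0
        self.cum_min = 0.0


def first_alarm(stream, detector):
    """Return the index of the first alarm in the stream, or None if there is none."""
    for i, x in enumerate(stream):
        if detector.update(x):
            return i
    return None
```

For example, feeding an error indicator that is 0.0 for 200 samples and then 1.0 afterwards raises an alarm a few samples after the change, while a constant stream raises none. Because older deviations are down-weighted by `fading`, the statistic does not accumulate indefinitely over a long stable phase.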

Publisher

Database: Elsevier - ScienceDirect
Journal: Information Sciences - Volumes 355–356, 10 August 2016, Pages 127–151

Authors

Edwin Lughofer, Eva Weigl, Wolfgang Heidl, Christian Eitzinger, Thomas Radauer
