کد مقاله کد نشریه سال انتشار مقاله انگلیسی نسخه تمام متن
6939472 1449971 2018 38 صفحه PDF دانلود رایگان
عنوان انگلیسی مقاله ISI
Accumulating regional density dissimilarity for concept drift detection in data streams
ترجمه فارسی عنوان
تقسیم تنوع چگالی منطقه ای برای تشخیص راندگی مفهوم در جریان داده ها
کلمات کلیدی
مفهوم رانش تغییرات داده ها، تغییر محیط، تغییر چابک، تخمین چگالی تجربی،
موضوعات مرتبط
مهندسی و علوم پایه مهندسی کامپیوتر چشم انداز کامپیوتر و تشخیص الگو
چکیده انگلیسی
In a non-stationary environment, newly received data may have different knowledge patterns from the data used to train learning models. As time passes, a learning model's performance may become increasingly unreliable. This problem is known as concept drift and is a common issue in real-world domains. Concept drift detection has attracted increasing attention in recent years. However, very few existing methods pay attention to small regional drifts, and their accuracy may vary due to differing statistical significance tests. This paper presents a novel concept drift detection method, based on regional-density estimation, named nearest neighbor-based density variation identification (NN-DVI). It consists of three components. The first is a k-nearest neighbor-based space-partitioning schema (NNPS), which transforms unmeasurable discrete data instances into a set of shared subspaces for density estimation. The second is a distance function that accumulates the density discrepancies in these subspaces and quantifies the overall differences. The third component is a tailored statistical significance test by which the confidence interval of a concept drift can be accurately determined. The distance applied in NN-DVI is sensitive to regional drift and has been proven to follow a normal distribution. As a result, the NN-DVI's accuracy and false-alarm rate are statistically guaranteed. Additionally, several benchmarks have been used to evaluate the method, including both synthetic and real-world datasets. The overall results show that NN-DVI has better performance in terms of addressing problems related to concept drift-detection.
ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Pattern Recognition - Volume 76, April 2018, Pages 256-272
نویسندگان
, , , ,