Article code: 4947177
Journal code: 1439567
Publication year: 2017
English article: 21 pages, PDF
Full-text version: free download
English title of the ISI article
Three-layer concept drifting detection in text data streams
Persian translation of the title (rendered back into English)
Three-layer concept detection in text data streams
Related topics
Engineering and Basic Sciences, Computer Engineering, Artificial Intelligence
English abstract
Text data streams are now common in real-world applications, where concept drift poses a significant challenge for classification. Compared with relational data streams, concept drifts hidden in text streams are usually reflected in the relationship between the feature vector and the instance labels. Existing concept drift detection methods, however, are mainly based on classification error rates. When applied to text streams, they perform poorly on measures such as false alarms and missed detections. Motivated by this, we first give a systematic analysis of concept drifts in text data streams. We then propose a three-layer concept drift detection approach, in which the three layers are the label space, the feature space, and the mapping relationships between labels and features, respectively. In this approach, the latter two layers are based on WoE (Weight of Evidence) values and the IV (Information Value) index. Experimental results show that our approach can improve both the performance of concept drift detection and the accuracy of classification, especially when concept drifts in text data streams are frequent.
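The abstract names WoE and IV but does not give the formulas the paper uses. As a rough illustration only, the sketch below uses the textbook definitions from credit scoring: for each feature value, WoE is the log ratio of its share among positive instances to its share among negative instances, and IV sums (share difference × WoE) over all values. The function name `woe_iv`, the `smoothing` parameter, and the binary-label setup are assumptions made for this example, not details taken from the paper.

```python
import math
from collections import Counter


def woe_iv(feature_values, labels, positive_label, smoothing=0.5):
    """Return ({value: WoE}, IV) for one categorical feature and a binary label.

    Hypothetical helper for illustration. `smoothing` is an additive
    pseudo-count (an assumption here) that keeps the logarithm finite
    when a feature value never co-occurs with one of the two classes.
    """
    pos, neg = Counter(), Counter()
    for value, label in zip(feature_values, labels):
        if label == positive_label:
            pos[value] += 1
        else:
            neg[value] += 1

    values = sorted(set(pos) | set(neg))
    total_pos = sum(pos.values()) + smoothing * len(values)
    total_neg = sum(neg.values()) + smoothing * len(values)

    woe, iv = {}, 0.0
    for v in values:
        p = (pos[v] + smoothing) / total_pos  # share of v among positives
        q = (neg[v] + smoothing) / total_neg  # share of v among negatives
        woe[v] = math.log(p / q)
        iv += (p - q) * woe[v]
    return woe, iv


# Toy example: 1 means a given term occurs in the document, 0 means it does not.
term_present = [1, 1, 0, 0]
doc_labels = ["sports", "sports", "tech", "tech"]
woe, iv = woe_iv(term_present, doc_labels, positive_label="sports")
```

One plausible use, again an assumption rather than the paper's procedure, is to compute the IV of each term on two consecutive windows of the stream and treat a large change as a hint that the mapping between features and labels has shifted.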
Publisher
Database: Elsevier - ScienceDirect
Journal: Neurocomputing - Volume 260, 18 October 2017, Pages 393-403