Article code: 6903115
Journal code: 1446750
Publication year: 2018
English article: 18 pages, PDF
Full-text version: free download
English title of the ISI article
On botnet detection with genetic programming under streaming data label budgets and class imbalance
Persian translation of the title
در تشخیص بوت نت با برنامه نویسی ژنتیکی تحت بودجه پخش برچسب های داده ها و عدم تعادل کلاس
Related subjects
Engineering and Basic Sciences; Computer Engineering; Computer Science (General)
English abstract
Algorithms for constructing classification models under streaming data scenarios are becoming increasingly important. For such algorithms to be applicable in 'real-world' contexts, we adopt the following objectives: 1) operate under label budgets, 2) make label requests without recourse to true label information, and 3) remain robust to class imbalance. Specifically, we assume that model building is performed using only the content of a Data Subset (as in active learning). Thus, the principal design decisions concern the definitions employed for the sampling and archiving policies. Moreover, these policies should operate without prior information regarding the distribution of classes, as this varies over the course of the stream. A team formulation for genetic programming (GP) is assumed as the generic model for classification in order to support incremental changes to classifier content. Benchmarking is conducted with thirteen real-world Botnet datasets with label budgets of the order of 0.5-5% and significant amounts of class imbalance. Specific recommendations are made for detecting the costly minor classes under these conditions. Comparison with current approaches to streaming data under label budgets supports the significance of these findings.
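To make the roles of the sampling and archiving policies concrete, the following is a minimal generic sketch in Python, not the authors' method: the abstract does not specify either policy, and no GP team is modelled here. The function `run_stream` and its parameters (`confidence`, `oracle`, `budget`, `threshold`, `cap`) are hypothetical names chosen for illustration. The sketch assumes that a label is requested only when the current model is uncertain and the budget allows it, and that the Data Subset keeps a bounded number of exemplars per class so that minor classes are not crowded out.

```python
from collections import defaultdict, deque


def run_stream(stream, confidence, oracle, budget=0.05, threshold=0.6, cap=5):
    """Process a stream under a label budget.

    Hypothetical illustration only. Returns (archive, labels_requested).
    confidence(x) -> float in [0, 1], computed without the true label.
    oracle(x)     -> true label, called only when a label is requested.
    """
    # Data Subset: one bounded queue per class (assumed archiving policy).
    archive = defaultdict(lambda: deque(maxlen=cap))
    requested = 0
    for seen, x in enumerate(stream, start=1):
        # Sampling policy: relies on model confidence, never on the true label.
        uncertain = confidence(x) < threshold
        # Label budget: requests so far may not exceed budget * records seen.
        within_budget = requested + 1 <= budget * seen
        if uncertain and within_budget:
            label = oracle(x)  # the only point at which a label is paid for
            requested += 1
            # Per-class cap: the oldest exemplar of that class is evicted first.
            archive[label].append(x)
    return archive, requested
```

Capping the archive per class, rather than overall, is one simple way to avoid assuming a class distribution in advance; a model would then be rebuilt from `archive` alone.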
Publisher
Database: Elsevier - ScienceDirect
Journal: Swarm and Evolutionary Computation - Volume 39, April 2018, Pages 123-140