Article code: 411753
Journal code: 679589
Publication year: 2015
English article: 17-page PDF
Full-text version: Free download
English title of the ISI article
A class-oriented feature selection approach for multi-class imbalanced network traffic datasets based on local and global metrics fusion
Keywords
Feature selection, local metrics, multi-class imbalance, data drift, network traffic
Related topics
Engineering and Basic Sciences > Computer Engineering > Artificial Intelligence
Abstract

Feature selection is often used as a pre-processing step for machine-learning-based network traffic classification. Many feature selection techniques have been developed to find an optimal subset of relevant features and to improve overall classification accuracy. However, such techniques ignore the class imbalance problem encountered in network traffic classification: the selected feature subset may be biased towards the traffic class that accounts for the majority of traffic flows on the Internet. To address this issue, this paper proposes a new approach, called class-oriented feature selection (COFS), to identify a relevant feature subset for every class. It combines the proposed local metric and the existing global metric to yield a potentially optimal feature subset for each class, and then removes the redundant features in each feature subset based on the weighted symmetric uncertainty. Additionally, to enhance generalization on network traffic data, an ensemble-learning-based scheme is presented with COFS to overcome the negative impact of data drift on a traffic classifier. Experiments on real-world network traffic data show that COFS outperforms existing feature selection techniques in most cases. Moreover, the approach achieves >96% flow accuracy and >93% byte accuracy on average.
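
The abstract mentions removing redundant features using a weighted symmetric uncertainty. The weighting scheme is not described here, so the following is only a minimal sketch of the standard, unweighted symmetric uncertainty, SU(X, Y) = 2 I(X; Y) / (H(X) + H(Y)), for discrete features. The function names `entropy` and `symmetric_uncertainty` are illustrative assumptions, not the paper's implementation.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def symmetric_uncertainty(x, y):
    """Standard (unweighted) symmetric uncertainty between two discrete sequences.

    Returns a value in [0, 1]: 0 for independent variables, 1 when each
    variable fully determines the other. This is NOT the weighted variant
    used by COFS, whose weights are not given in the abstract.
    """
    h_x, h_y = entropy(x), entropy(y)
    if h_x + h_y == 0:
        # Both variables are constant; treat them as carrying no shared information.
        return 0.0
    h_xy = entropy(list(zip(x, y)))   # joint entropy H(X, Y)
    mutual_info = h_x + h_y - h_xy    # I(X; Y)
    return 2.0 * mutual_info / (h_x + h_y)
```

In a redundancy filter of this kind, a feature whose symmetric uncertainty with an already selected feature is high would typically be treated as redundant and dropped; the exact criterion used by COFS is described in the full paper.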

Publisher
Database: Elsevier - ScienceDirect
Journal: Neurocomputing - Volume 168, 30 November 2015, Pages 365–381