Article code: 495836
Journal code: 862841
Publication year: 2012
English article: 10-page PDF
Full-text version: Free download
English title of the ISI article
Automatically building datasets of labeled IP traffic traces: A self-training approach
Related topics
Engineering and Basic Sciences, Computer Engineering, Computer Science Software
English abstract

Many approaches have been proposed so far to tackle computer network security. Among them, several systems exploit Machine Learning and Pattern Recognition techniques by treating malicious behavior detection as a classification problem. Both supervised and unsupervised algorithms have been used in this context, each with its own benefits and shortcomings. Supervised techniques require a representative training set of suitably labeled samples, which reliably indicates what a human expert wants the system to learn and recognize. In real environments it is very difficult to collect a representative dataset of correctly labeled traffic traces. In adversarial environments the task is made even harder by malicious attackers who try to hide the evidence of their actions. To overcome this problem, this paper presents a self-training system that builds a dataset of labeled network traffic from raw tcpdump traces, with no prior knowledge of the data. Results on both emulated and real traffic traces show that intrusion detection systems trained on such a dataset perform as well as the same systems trained on correctly hand-labeled data.

Highlights
► Many approaches have been proposed to tackle computer network security, most of them based on supervised techniques.
► In real environments it is very difficult to collect a representative dataset of correctly labeled traffic traces.
► In adversarial environments this is made even harder by malicious attackers who try to hide the evidence of their actions.
► A self-training system based on the Dempster–Shafer theory is presented.
► It builds a dataset of labeled network traffic from raw tcpdump traces. No prior knowledge of the data is required.
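
The highlights name the Dempster–Shafer theory but this listing does not say how the paper applies it. Purely as a general illustration, the sketch below shows Dempster's rule of combination, the standard way to fuse two belief (mass) assignments in that theory. The function name `combine`, the hypothesis labels, and the mass values are all hypothetical and are not taken from the paper.

```python
from itertools import product


def combine(m1, m2):
    """Fuse two mass functions with Dempster's rule of combination.

    Each mass function maps a frozenset of hypotheses to its mass,
    and the masses of each function are assumed to sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            # Agreeing evidence supports the common hypotheses.
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            # Disjoint sets contribute to the conflict mass K.
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("total conflict: the sources cannot be combined")
    # Normalize by 1 - K so the fused masses sum to 1 again.
    return {h: v / (1.0 - conflict) for h, v in combined.items()}


# Hypothetical example: two detectors rating one traffic flow.
NORMAL = frozenset({"normal"})
ATTACK = frozenset({"attack"})
EITHER = NORMAL | ATTACK  # mass left on the whole frame expresses ignorance

detector_a = {ATTACK: 0.6, EITHER: 0.4}
detector_b = {ATTACK: 0.5, NORMAL: 0.2, EITHER: 0.3}

fused = combine(detector_a, detector_b)
for hypothesis, mass in fused.items():
    print(sorted(hypothesis), round(mass, 4))
```

In this made-up example the two sources disagree on a share of 0.12 of the mass, which is discarded, and the remaining mass is rescaled. In a self-training setting, a fused mass of this kind could plausibly be thresholded to decide which flows receive a label, but that step is an assumption and is not described in this listing.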

Publisher
Database: Elsevier - ScienceDirect
Journal: Applied Soft Computing - Volume 12, Issue 6, June 2012, Pages 1640–1649