Article code: 430702
Journal code: 688122
Publication year: 2014
English article: 16-page PDF
Full-text version: free download
English title of the ISI article
Internet traffic clustering with side information
Keywords
Traffic classification, semi-supervised machine learning, constrained clustering
Related topics
Engineering and Basic Sciences, Computer Engineering, Computational Theory and Mathematics
English abstract


• Side information: most TCP traffic flows are associated with some other flows.
• A novel constrained Gaussian mixture model is proposed.
• Evaluation shows that this model produces purer traffic clusters than other models.

Internet traffic classification is a critical function of network management and security systems. Because of the limitations of traditional port-based and payload-based classification approaches, the past several years have seen extensive research on using machine learning techniques to classify Internet traffic based on packet-level and flow-level characteristics. To learn from unlabeled traffic data, previous studies have applied some classic clustering methods, but the reported accuracy is unsatisfactory. In this paper, we propose a semi-supervised approach to accurate Internet traffic clustering, motivated by the observation that partial equivalence relationships exist widely among Internet traffic flows. In particular, we formulate the problem using a Gaussian Mixture Model (GMM) with set-based equivalence constraints and propose a constrained Expectation Maximization (EM) algorithm for clustering. Experiments with real-world packet traces show that the proposed approach can significantly improve the quality of the resulting traffic clusters.
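
To make the idea of a GMM with set-based equivalence constraints more concrete, the following is a minimal sketch, not the paper's algorithm. It assumes one common formulation of constrained EM: every flow in an equivalence set is forced to share a single mixture component, so the E-step computes one responsibility vector per set instead of one per flow. The function names (`log_gauss`, `constrained_em`), the farthest-first initialisation, the per-set weight update, and the synthetic two-dimensional "flow features" are all illustrative assumptions.

```python
import numpy as np


def log_gauss(X, mean, cov):
    """Log-density of each row of X under N(mean, cov)."""
    d = X.shape[1]
    chol = np.linalg.cholesky(cov)
    z = np.linalg.solve(chol, (X - mean).T)      # whitened residuals, shape (d, n)
    log_det = 2.0 * np.log(np.diag(chol)).sum()
    return -0.5 * (d * np.log(2.0 * np.pi) + log_det + (z ** 2).sum(axis=0))


def constrained_em(X, set_ids, k, n_iter=50, reg=1e-6):
    """EM for a GMM in which all flows of one equivalence set share a component.

    X: (n, d) flow features. set_ids: (n,) integer set index in 0..S-1;
    a flow with no known related flows forms its own singleton set.
    Returns weights, means, covariances and one hard label per set.
    """
    n, d = X.shape
    n_sets = int(set_ids.max()) + 1
    # Assumed initialisation: farthest-first means, shared spherical covariance.
    means = [X[0]]
    for _ in range(1, k):
        dist = np.min([((X - m) ** 2).sum(axis=1) for m in means], axis=0)
        means.append(X[dist.argmax()])
    means = np.array(means)
    covs = np.array([np.eye(d) * (X.var(axis=0).mean() + reg) for _ in range(k)])
    weights = np.full(k, 1.0 / k)

    for _ in range(n_iter):
        # E-step: add up per-flow log-likelihoods inside each set, then
        # normalise once per set (this is where the constraint enters).
        ll = np.stack([log_gauss(X, means[j], covs[j]) for j in range(k)], axis=1)
        set_ll = np.zeros((n_sets, k))
        np.add.at(set_ll, set_ids, ll)
        log_post = np.log(weights + 1e-12) + set_ll
        log_post -= log_post.max(axis=1, keepdims=True)
        resp = np.exp(log_post)
        resp /= resp.sum(axis=1, keepdims=True)  # shape (S, k)

        # M-step: each flow inherits the responsibilities of its set.
        weights = resp.mean(axis=0)              # assumed: averaged over sets
        flow_resp = resp[set_ids]                # shape (n, k)
        nk = flow_resp.sum(axis=0) + 1e-12
        means = (flow_resp.T @ X) / nk[:, None]
        for j in range(k):
            diff = X - means[j]
            covs[j] = (flow_resp[:, j, None] * diff).T @ diff / nk[j] + reg * np.eye(d)

    return weights, means, covs, resp.argmax(axis=1)


# Synthetic stand-in for flow features: two well-separated groups,
# with every three consecutive flows declared equivalent.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (60, 2)), rng.normal(10.0, 1.0, (60, 2))])
set_ids = np.arange(120) // 3
weights, means, covs, labels = constrained_em(X, set_ids, k=2)
```

With singleton sets only, this sketch reduces to ordinary EM for a GMM; the side information only changes how the E-step pools evidence across related flows.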

Publisher
Database: Elsevier - ScienceDirect
Journal: Journal of Computer and System Sciences - Volume 80, Issue 5, August 2014, Pages 1021–1036