Article ID: 722645
Journal: The Journal of China Universities of Posts and Telecommunications
Published Year: 2010
Pages: 5 Pages
File Type: PDF
Abstract

The diminished accuracy of port-based and payload-based classification motivates the use of transport-layer statistics for network traffic classification. This paper proposes a semi-supervised clustering approach, based on an improved K-Means clustering algorithm, to partition a training set of network flows that contains a large number of unlabeled flows and only a few labeled flows. Instead of selecting the initial cluster centers at random, the algorithm uses the variance of the flow attributes to initialize them. The scarce labeled flows are then used to construct a mapping from the clusters to the predefined set of traffic classes. Experimental results show that both the overall accuracy and the sum of squared errors (SSE) of the proposed algorithm are better than those of the standard K-Means algorithm defined in Ref. [5].
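
The pipeline described in the abstract can be illustrated with a minimal Python sketch. The abstract does not say exactly how the attribute variance is used to pick the initial centers, so the initialization below is an assumption: it sorts the flows by their highest-variance attribute, splits them into k contiguous groups, and takes each group's mean as a starting center. The majority-vote mapping from clusters to classes, the function names (variance_init, kmeans, map_clusters), and the use of -1 to mark unlabeled flows are likewise illustrative choices, not details taken from the paper.

```python
import numpy as np

def variance_init(X, k):
    """Deterministic initial centers (assumed scheme, not the paper's exact rule):
    sort flows by the attribute with the largest variance, split the sorted
    flows into k contiguous groups, and use each group's mean as a center."""
    col = np.argmax(X.var(axis=0))
    order = np.argsort(X[:, col], kind="stable")
    groups = np.array_split(order, k)
    return np.array([X[idx].mean(axis=0) for idx in groups])

def kmeans(X, k, n_iter=100):
    """Plain Lloyd iterations started from variance_init; returns
    (centers, cluster index per flow, sum of squared errors)."""
    centers = variance_init(X, k)
    for _ in range(n_iter):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        assign = dist.argmin(axis=1)
        new_centers = centers.copy()
        for j in range(k):
            members = X[assign == j]
            if len(members):  # keep the old center if a cluster is empty
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    # Final assignment and SSE with respect to the final centers.
    dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    assign = dist.argmin(axis=1)
    sse = dist[np.arange(len(X)), assign].sum()
    return centers, assign, sse

def map_clusters(assign, labels, k, unknown=-1):
    """Map each cluster to the majority class among its labeled flows
    (assumed rule); clusters without labeled flows map to `unknown`."""
    mapping = {}
    for j in range(k):
        known = labels[(assign == j) & (labels != unknown)]
        if len(known):
            values, counts = np.unique(known, return_counts=True)
            mapping[j] = int(values[counts.argmax()])
        else:
            mapping[j] = unknown
    return mapping

# Toy data: six flows with two attributes, only two of them labeled.
X = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]], dtype=float)
labels = np.array([-1, 0, -1, 1, -1, -1])

centers, assign, sse = kmeans(X, k=2)
mapping = map_clusters(assign, labels, k=2)
predicted = [mapping[int(j)] for j in assign]
```

Because the initialization is deterministic, repeated runs on the same training set yield the same clusters and the same SSE, which makes comparison against randomly initialized K-Means straightforward. In practice the flow attributes would be transport-layer statistics, and they would usually be normalized before distances are computed.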
