A multinomial logistic regression modeling approach for anomaly intrusion detection

Article ID	Journal	Published Year	Pages	File Type
10341432	Computers & Security	2005	13 Pages	PDF

Abstract

Although researchers have long studied using statistical modeling techniques to detect anomaly intrusion and profile user behavior, the feasibility of applying multinomial logistic regression modeling to predict multi-attack types has not been addressed, and the risk factors associated with individual major attacks remain unclear. To address the gaps, this study used the KDD-cup 1999 data and bootstrap simulation method to fit 3000 multinomial logistic regression models with the most frequent attack types (probe, DoS, U2R, and R2L) as an unordered independent variable, and identified 13 risk factors that are statistically significantly associated with these attacks. These risk factors were then used to construct a final multinomial model that had an ROC area of 0.99 for detecting abnormal events. Compared with the top KDD-cup 1999 winning results that were based on a rule-based decision tree algorithm, the multinomial logistic model-based classification results had similar sensitivity values in detecting normal (98.3% vs. 99.5%), probe (85.6% vs. 83.3%), and DoS (97.2% vs. 97.1%); remarkably high sensitivity in U2R (25.9% vs. 13.2%) and R2L (11.2% vs. 8.4%); and a significantly lower overall misclassification rate (18.9% vs. 35.7%). The study emphasizes that the multinomial logistic regression modeling technique with the 13 risk factors provides a robust approach to detect anomaly intrusion.

Keywords

Computer security Bootstrap Intrusion detection Classification