Article ID Journal Published Year Pages File Type
6884279 Computers & Security 2015 16 Pages PDF
Abstract
This paper introduces AMAL, an automated and behavior-based malware analysis and labeling system that addresses shortcomings of the existing systems. AMAL consists of two sub-systems, AutoMal and MaLabel. AutoMal provides tools to collect low granularity behavioral artifacts that characterize malware usage of the file system, memory, network, and registry, and does that by running malware samples in virtualized environments. On the other hand, MaLabel uses those artifacts to create representative features, use them for building classifiers trained by manually vetted training samples, and use those classifiers to classify malware samples into families similar in behavior. AutoMal also enables unsupervised learning, by implementing multiple clustering algorithms for samples grouping. An evaluation of both AutoMal and MaLabel based on medium-scale (4000 samples) and large-scale datasets (more than 115,000 samples)-collected and analyzed by AutoMal over 13 months-shows AMAL's effectiveness in accurately characterizing, classifying, and grouping malware samples. MaLabel achieves a precision of 99.5% and recall of 99.6% for certain families' classification, and more than 98% of precision and recall for unsupervised clustering. Several benchmarks, cost estimates and measurements highlight the merits of AMAL.
Related Topics
Physical Sciences and Engineering Computer Science Computer Networks and Communications
Authors
, , ,