Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
458566 | 696170 | 2012 | 23-page PDF | Free download |

Malware encyclopedias now play a vital role in disseminating information about security threats. Coupled with categorization and generalization capabilities, such encyclopedias might help better defend against both isolated and clustered specimens. In this paper, we present Malware Evaluator, a classification framework that treats malware categorization as a supervised learning task, builds learning models with both support vector machines and decision trees, and finally visualizes classifications with self-organizing maps. Malware Evaluator refrains from using readily available taxonomic features to produce species classifications. Instead, we generate attributes of malware strains via a tokenization process and select the attributes used according to their projected information gain. We also deploy word stemming and stopword removal techniques to reduce the dimensionality of the feature space. In contrast to existing approaches, Malware Evaluator defines its taxonomic features based on the behavior of species throughout their life-cycle, allowing it to discover properties that previously might have gone unobserved. The learning and generalization capabilities of the framework also help detect and categorize zero-day attacks. Our prototype helps establish that malicious strains improve their penetration rate through multiple propagation channels as well as compact code footprints; moreover, they attempt to evade detection by resorting to code polymorphism and information encryption. Malware Evaluator also reveals that breeds in the categories of Trojan, Infector, Backdoor, and Worm contribute significantly to the malware population and impose critical risks on the Internet ecosystem.
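The attribute-generation steps named in the abstract (tokenization, stopword removal, stemming, and selection by information gain) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the stopword list, the crude suffix-stripping `stem` function (a stand-in for a real stemmer), the helper names, and the four invented encyclopedia-style entries. It is not the paper's implementation, only one plausible reading of the pipeline using the Python standard library.

```python
import math
import re
from collections import Counter

# Hypothetical stopword list and suffix rules, chosen only for illustration.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "via", "it", "on", "in"}
SUFFIXES = ("ing", "es", "ed", "s")


def stem(token):
    """Crude suffix stripping; a stand-in for a real stemmer."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def tokenize(text):
    """Lowercase, split on non-letters, drop stopwords, stem what remains."""
    words = re.findall(r"[a-z]+", text.lower())
    return {stem(w) for w in words if w not in STOPWORDS}


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels).values()
    return -sum((n / total) * math.log2(n / total) for n in counts)


def information_gain(docs, labels, term):
    """Entropy reduction obtained by splitting entries on presence of `term`."""
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without = [y for d, y in zip(docs, labels) if term not in d]
    remainder = sum(
        len(part) / len(labels) * entropy(part)
        for part in (with_term, without)
        if part
    )
    return entropy(labels) - remainder


def select_features(texts, labels, k):
    """Return the k (gain, term) pairs with the highest information gain."""
    docs = [tokenize(t) for t in texts]
    vocab = sorted(set().union(*docs))
    scored = [(information_gain(docs, labels, t), t) for t in vocab]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))  # ties broken alphabetically
    return scored[:k]


# Invented entries in the style of an encyclopedia description (not real data).
ENTRIES = [
    ("Spreads via email attachments and network shares", "Worm"),
    ("Propagates through network shares and removable drives", "Worm"),
    ("Opens a port and listens for remote commands", "Backdoor"),
    ("Listens on a port, allowing remote access", "Backdoor"),
]

texts = [text for text, _ in ENTRIES]
labels = [label for _, label in ENTRIES]
top_terms = select_features(texts, labels, 5)
for gain, term in top_terms:
    print(f"{term}: {gain:.3f}")
```

In this toy setting, the terms that appear in every entry of one category and in no entry of the other receive the highest gain, which is the property a selection step of this kind relies on. The retained terms would then serve as the feature vector handed to a classifier such as a support vector machine or a decision tree.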
Highlights
► A malware evaluation framework is proposed that not only clusters species encountered on the Internet according to taxonomic features covering their life cycle but also helps evaluate their evolution in an automated manner.
► The malware evaluation framework bases its operation on encyclopedia entries such as those of TrendMicro and Symantec, and treats the species classification problem at hand as a machine learning task that is tackled with support vector machines, gradient boosting decision trees, and self-organizing maps.
► The proposed malware evaluator quickly and efficiently helps confirm facts about penetration avenues, reveal detection avoidance techniques and outline the most popular species of current malware.
► With the machine-learned models built in the framework, the malware evaluator can automatically classify both existing malware breeds and those yet to be discovered. It is also capable of identifying and categorizing zero-day attacks by integrating with other automated malware analysis tools such as Norman Sandbox.
Journal: Journal of Systems and Software - Volume 85, Issue 7, July 2012, Pages 1650–1672