Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
6884521 | Digital Investigation | 2015 | 15 Pages |
Abstract
Reverse engineering is the primary step in analyzing a piece of malware. After disassembling a malware binary, a reverse engineer must spend extensive effort analyzing the resulting assembly code and then documenting it through comments for future reference. In this paper, we develop an assembly code clone search system called ScalClone, based on our previous work on assembly code clone detection systems. The objective of the system is to identify the code clones of a target malware in a collection of previously analyzed malware binaries. Our new contributions are summarized as follows. First, we introduce two assembly code clone search methods for malware analysis with a high recall rate. Second, our methods allow malware analysts to discover both exact and inexact clones at different token normalization levels. Third, we present a scalable system with a database model to support large-scale assembly code search. Finally, experimental results on real-life malware binaries suggest that our proposed methods can effectively identify assembly code clones under different code mutation scenarios.
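To illustrate what searching for clones "at different token normalization levels" can mean, the following is a minimal Python sketch. It is not ScalClone's implementation: the abstract does not specify the normalization scheme, so the three levels, the register list, the `REG`/`MEM`/`VAL` tokens, and all function names here are assumptions chosen for illustration. Exact clones are modeled as regions whose normalized instructions are identical, and inexact clones are scored with a simple Jaccard similarity over normalized instructions.

```python
import re

# Hypothetical register set (32-bit x86 general-purpose registers only).
REGISTERS = {"eax", "ebx", "ecx", "edx", "esi", "edi", "ebp", "esp"}
# Matches constants such as 0x1f, 1fh, or 42 (assumed syntax).
CONSTANT = re.compile(r"0x[0-9a-f]+|[0-9][0-9a-f]*h|[0-9]+")


def normalize_operand(operand, level):
    """Generalize one operand. Assumed levels:
    0 = no normalization, 1 = registers -> REG,
    2 = additionally memory -> MEM and constants -> VAL."""
    op = operand.strip().lower()
    if op in REGISTERS:
        return "REG" if level >= 1 else op
    if op.startswith("["):
        return "MEM" if level >= 2 else op
    if CONSTANT.fullmatch(op):
        return "VAL" if level >= 2 else op
    return op


def normalize_instruction(line, level):
    """Normalize a single 'mnemonic op1, op2' assembly line."""
    mnemonic, _, rest = line.strip().partition(" ")
    operands = [normalize_operand(o, level) for o in rest.split(",") if o.strip()]
    return (mnemonic.lower() + " " + ", ".join(operands)).strip()


def normalize_region(lines, level):
    """Normalize every instruction in a code region."""
    return tuple(normalize_instruction(line, level) for line in lines)


def is_exact_clone(region_a, region_b, level):
    """Two regions are exact clones if they match after normalization."""
    return normalize_region(region_a, level) == normalize_region(region_b, level)


def similarity(region_a, region_b, level):
    """Jaccard similarity of normalized instruction sets (inexact clones)."""
    a = set(normalize_region(region_a, level))
    b = set(normalize_region(region_b, level))
    return len(a & b) / len(a | b) if a | b else 1.0


region_a = ["mov eax, 5", "add eax, ebx"]
region_b = ["mov ecx, 5", "add ecx, edx"]   # same code, different registers
region_c = ["mov ecx, 7", "add ecx, edx"]   # also a different constant
region_d = ["mov eax, 5", "add eax, ebx", "ret"]  # one extra instruction
```

Under these assumptions, `region_a` and `region_b` match only once registers are normalized (level 1), while `region_c` additionally requires constants to be normalized (level 2). `region_d` is never an exact clone of `region_a`, but `similarity` still reports substantial overlap, which is the kind of case an inexact search would be expected to surface.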
Related Topics
Physical Sciences and Engineering
Computer Science
Computer Networks and Communications
Authors
Mohammad Reza Farhadi, Benjamin C.M. Fung, Yin Bun Fung, Philippe Charland, Stere Preda, Mourad Debbabi