Article ID: 6884521
Journal: Digital Investigation
Published Year: 2015
Pages: 15
File Type: PDF
Abstract
Reverse engineering is the primary step in analyzing a piece of malware. After disassembling a malware binary, a reverse engineer must spend extensive effort analyzing the resulting assembly code and documenting it through comments for future reference. In this paper, we present an assembly code clone search system called ScalClone, built on our previous work on assembly code clone detection systems. The objective of the system is to identify code clones of a target malware in a collection of previously analyzed malware binaries. Our new contributions are summarized as follows: First, we introduce two assembly code clone search methods for malware analysis with a high recall rate. Second, our methods allow malware analysts to discover both exact and inexact clones at different token normalization levels. Third, we present a scalable system with a database model to support large-scale assembly code search. Finally, experimental results on real-life malware binaries suggest that our proposed methods can effectively identify assembly code clones under different code mutation scenarios.
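To make the idea of searching for exact clones at a chosen token normalization level concrete, the following is a minimal sketch and not ScalClone's actual implementation. It assumes x86-style text disassembly, a fixed-size sliding window of instructions, and an in-memory index. The register list, the placeholder tokens `REG`, `VAL`, and `MEM`, and all function names are hypothetical choices made for illustration.

```python
import re
from collections import defaultdict

# Assumption: a small illustrative register set, not a complete x86 list.
REGISTERS = {"eax", "ebx", "ecx", "edx", "esi", "edi", "ebp", "esp"}

# Decimal, 0x-prefixed hex, or h-suffixed hex that starts with a digit.
CONSTANT = re.compile(r"0x[0-9a-f]+|\d[0-9a-f]*h|\d+")


def normalize_operand(operand, level):
    """Level 0 keeps the operand as written; level 1 maps it to a class token."""
    op = operand.strip().lower()
    if level == 0:
        return op
    if op in REGISTERS:
        return "REG"
    if CONSTANT.fullmatch(op):
        return "VAL"
    if "[" in op:
        return "MEM"
    return op


def normalize_instruction(line, level):
    """Normalize one instruction such as 'mov eax, 1'."""
    mnemonic, _, rest = line.strip().partition(" ")
    operands = [normalize_operand(o, level) for o in rest.split(",") if o.strip()]
    return mnemonic.lower() + (" " + ",".join(operands) if operands else "")


def windows(instructions, size):
    """Yield (start_index, window) for every run of `size` instructions."""
    for i in range(len(instructions) - size + 1):
        yield i, tuple(instructions[i:i + size])


def find_exact_clones(target, corpus, level=1, size=3):
    """Return (target_offset, sample_name, sample_offset) for matching windows."""
    index = defaultdict(list)
    for name, code in corpus.items():
        normalized = [normalize_instruction(line, level) for line in code]
        for i, window in windows(normalized, size):
            index[window].append((name, i))

    normalized_target = [normalize_instruction(line, level) for line in target]
    hits = []
    for i, window in windows(normalized_target, size):
        for name, j in index.get(window, []):
            hits.append((i, name, j))
    return hits


# Hypothetical data: the two fragments differ only in registers and constants.
target = ["mov eax, 1", "add eax, ebx", "ret"]
corpus = {"sample_a": ["push ebp", "mov ecx, 5", "add ecx, edx", "ret"]}
clones_normalized = find_exact_clones(target, corpus, level=1)
clones_raw = find_exact_clones(target, corpus, level=0)
```

In this sketch the two fragments match only once registers and constants are normalized, which illustrates why a coarser normalization level can surface clones that survive register reassignment. Inexact clone search would need a similarity measure over windows rather than exact equality, which is not shown here.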
Related Topics
Physical Sciences and Engineering > Computer Science > Computer Networks and Communications