Combined (static and dynamic) analysis of binary code

This paper investigates the process of binary code analysis. To achieve typical goals (such as extracting algorithm and data formats, exploiting vulnerabilities, revealing backdoors and undocumented features) a security analyst needs to explore control and data flow, reconstruct functions and variab...

Full description

Bibliographic Details
Main Authors: A. YU. Tikhonov, A. I. Avetisyan
Format: Article
Language:English
Published: Ivannikov Institute for System Programming of the Russian Academy of Sciences 2018-10-01
Series:Труды Института системного программирования РАН
Subjects:
Online Access:https://ispranproceedings.elpub.ru/jour/article/view/1007
Description
Summary:This paper investigates the process of binary code analysis. To achieve typical goals (such as extracting algorithm and data formats, exploiting vulnerabilities, revealing backdoors and undocumented features) a security analyst needs to explore control and data flow, reconstruct functions and variables, identify input and output data. Traditionally for this purposes disassemblers and other static data flow analysis tools have been used. However, since developers have been taking steps to protect their programs from analysis (for example, code being unpacked or decrypted at runtime), static analysis may not yield results.In such cases we propose to use dynamic analysis (analysis of execution traces of the program) to complement static. The problems that arise in the analysis of binary programs are discussed, and corresponding ways to automate solving them are suggested. The core of proposed method consists of whole system tracing and consecutive representation lifting: OS-aware events, process/thread identification, fully automated control and data flow reconstruction. The only manual step is searching for anchor instructions in the trace, e.g. I/O operations, which are used as input criteria for another automated step: precise algorithm extraction by trace slicing. The final step of the method constructs static test case code suitable for further analysis in tools such as IDA Pro. We implemented the proposed approach in an environment for dynamic analysis of binary code and evaluated it against a model example and two real-world examples: a program license manager and a malware program. Our results show that approach successfully explores algorithms and extracts them from whole system traces. The required efforts and amount of time are significantly reduced as compared with traditional disassembler and interactive debugger.
ISSN:2079-8156
2220-6426