Combined (static and dynamic) analysis of binary code

This paper investigates the process of binary code analysis. To achieve typical goals (such as extracting algorithms and data formats, exploiting vulnerabilities, revealing backdoors and undocumented features), a security analyst needs to explore control and data flow, reconstruct functions and variab...


Bibliographic Details
Main Authors: A. Yu. Tikhonov, A. I. Avetisyan
Format: Article
Language: English
Published: Ivannikov Institute for System Programming of the Russian Academy of Sciences 2018-10-01
Series: Труды Института системного программирования РАН (Proceedings of the Institute for System Programming of the RAS)
Online Access: https://ispranproceedings.elpub.ru/jour/article/view/1007
id doaj-f96b05c276d046998118c5352f470d18
record_format Article
collection DOAJ
language English
format Article
sources DOAJ
author A. Yu. Tikhonov
A. I. Avetisyan
title Combined (static and dynamic) analysis of binary code
publisher Ivannikov Institute for System Programming of the Russian Academy of Sciences
series Труды Института системного программирования РАН (Proceedings of the Institute for System Programming of the RAS)
issn 2079-8156
2220-6426
publishDate 2018-10-01
description This paper investigates the process of binary code analysis. To achieve typical goals (such as extracting algorithms and data formats, exploiting vulnerabilities, revealing backdoors and undocumented features), a security analyst needs to explore control and data flow, reconstruct functions and variables, and identify input and output data. Traditionally, disassemblers and other static data flow analysis tools have been used for these purposes. However, since developers have been taking steps to protect their programs from analysis (for example, code being unpacked or decrypted at runtime), static analysis may not yield results. In such cases we propose to use dynamic analysis (analysis of execution traces of the program) to complement static analysis. The problems that arise in the analysis of binary programs are discussed, and corresponding ways to automate solving them are suggested. The core of the proposed method consists of whole-system tracing and consecutive representation lifting: OS-aware events, process/thread identification, and fully automated control and data flow reconstruction. The only manual step is searching for anchor instructions in the trace, e.g. I/O operations, which are used as input criteria for another automated step: precise algorithm extraction by trace slicing. The final step of the method constructs static test case code suitable for further analysis in tools such as IDA Pro. We implemented the proposed approach in an environment for dynamic analysis of binary code and evaluated it against a model example and two real-world examples: a program license manager and a malware program. Our results show that the approach successfully explores algorithms and extracts them from whole-system traces. The required effort and time are significantly reduced compared with using a traditional disassembler and interactive debugger.
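The trace-slicing step mentioned in the description can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `TraceStep` record, the `backward_slice` function, the location names, and the six-instruction trace are all hypothetical, and the sketch follows data dependencies only (no control dependencies, and registers and memory cells are treated as indivisible locations). Starting from an anchor instruction such as an I/O call, it walks the trace backwards and keeps only the steps that produced values the anchor ultimately consumes.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Set


@dataclass(frozen=True)
class TraceStep:
    """One executed instruction in a recorded trace (hypothetical format)."""
    index: int
    text: str
    reads: FrozenSet[str]   # locations (registers, memory cells) the step reads
    writes: FrozenSet[str]  # locations the step overwrites


def make_step(index: int, text: str,
              reads: Iterable[str] = (), writes: Iterable[str] = ()) -> TraceStep:
    """Convenience constructor that freezes the read/write sets."""
    return TraceStep(index, text, frozenset(reads), frozenset(writes))


def backward_slice(trace: List[TraceStep], anchor: int) -> List[TraceStep]:
    """Return the steps that the anchor step depends on through data flow.

    Assumption: every write fully overwrites its location, so a definition
    kills the location and makes the step's own inputs live instead.
    """
    live: Set[str] = set(trace[anchor].reads)
    kept = [trace[anchor]]
    for current in reversed(trace[:anchor]):
        defined = live & current.writes
        if defined:
            live -= defined        # these values are now explained
            live |= current.reads  # but their inputs still need explaining
            kept.append(current)
    kept.reverse()
    return kept


# Illustrative trace: two interleaved computations, only one reaches the output.
TRACE = [
    make_step(0, "mov eax, [key]", reads={"mem:key"}, writes={"eax"}),
    make_step(1, "mov ebx, 5", writes={"ebx"}),
    make_step(2, "xor eax, 0x5a", reads={"eax"}, writes={"eax"}),
    make_step(3, "add ebx, 1", reads={"ebx"}, writes={"ebx"}),
    make_step(4, "mov [buf], eax", reads={"eax"}, writes={"mem:buf"}),
    make_step(5, "call send  ; anchor: output of buf", reads={"mem:buf"}),
]

if __name__ == "__main__":
    for kept_step in backward_slice(TRACE, anchor=5):
        print(kept_step.index, kept_step.text)
```

In this toy trace the slice keeps steps 0, 2, 4 and 5 and drops the unrelated `ebx` computation, which is the kind of reduction that makes an extracted algorithm small enough to study separately.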
topic algorithm recovery from binary code
search for undocumented features in binary code
methods for obtaining traces of binary programs
url https://ispranproceedings.elpub.ru/jour/article/view/1007