Efficient Detection and Classification of Internet-of-Things Malware Based on Byte Sequences from Executable Files

Simple implementation and autonomous operation features make the Internet-of-Things (IoT) vulnerable to malware attacks. Static analysis of IoT malware executable files is a feasible approach to understanding the behavior of IoT malware for mitigation and prevention. However, current analytic approa...

Bibliographic Details
Main Authors: Tzu-Ling Wan, Tao Ban, Shin-Ming Cheng, Yen-Ting Lee, Bo Sun, Ryoichi Isawa, Takeshi Takahashi, Daisuke Inoue
Format: Article
Language: English
Published: IEEE 2020-01-01
Series: IEEE Open Journal of the Computer Society
Subjects: Computer security, machine learning, binary code, malware analysis, static analysis
Online Access: https://ieeexplore.ieee.org/document/9240051/
id doaj-93095ac455ec43aa8ef547e7c2c6a61e
record_format Article
spelling doaj-93095ac455ec43aa8ef547e7c2c6a61e
2021-03-29T16:59:26Z
eng
IEEE
IEEE Open Journal of the Computer Society
ISSN 2644-1268
2020-01-01
Volume 1, pages 262-275
DOI 10.1109/OJCS.2020.3033974
IEEE document 9240051
Efficient Detection and Classification of Internet-of-Things Malware Based on Byte Sequences from Executable Files
Tzu-Ling Wan (Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan)
Tao Ban, https://orcid.org/0000-0002-9616-3212 (National Institute of Information and Communications Technology, Koganei, Tokyo, Japan)
Shin-Ming Cheng, https://orcid.org/0000-0002-9796-0643 (Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan)
Yen-Ting Lee (Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan)
Bo Sun (Saitama Institute of Technology, Saitama, Japan)
Ryoichi Isawa (National Institute of Information and Communications Technology, Koganei, Tokyo, Japan)
Takeshi Takahashi (National Institute of Information and Communications Technology, Koganei, Tokyo, Japan)
Daisuke Inoue (National Institute of Information and Communications Technology, Koganei, Tokyo, Japan)
Simple implementation and autonomous operation features make the Internet-of-Things (IoT) vulnerable to malware attacks. Static analysis of IoT malware executable files is a feasible approach to understanding the behavior of IoT malware for mitigation and prevention. However, current analytic approaches based on opcodes or call graphs typically do not work well with diversity in central processing unit (CPU) architectures and are often resource intensive. In this paper, we propose an efficient method for leveraging machine learning methods to detect and classify IoT malware programs. We show that reliable and efficient detection and classification can be achieved by exploring the essential discriminating information stored in the byte sequences at the entry points of executable programs. We demonstrate the performance of the proposed method using a large-scale dataset consisting of 111K benignware and 111K malware programs from seven CPU architectures. The proposed method achieves near optimal generalization performance for malware detection (99.96% accuracy) and for malware family classification (98.47% accuracy). Moreover, when CPU architecture information is considered in learning, the proposed method combined with support vector machine classifiers can yield even higher generalization performance using fewer bytes from the executable files. The findings in this paper are promising for implementing light-weight malware protection on IoT devices with limited resources.
https://ieeexplore.ieee.org/document/9240051/
Computer security
machine learning
binary code
malware analysis
static analysis
collection DOAJ
language English
format Article
sources DOAJ
author Tzu-Ling Wan
Tao Ban
Shin-Ming Cheng
Yen-Ting Lee
Bo Sun
Ryoichi Isawa
Takeshi Takahashi
Daisuke Inoue
spellingShingle Tzu-Ling Wan
Tao Ban
Shin-Ming Cheng
Yen-Ting Lee
Bo Sun
Ryoichi Isawa
Takeshi Takahashi
Daisuke Inoue
Efficient Detection and Classification of Internet-of-Things Malware Based on Byte Sequences from Executable Files
IEEE Open Journal of the Computer Society
Computer security
machine learning
binary code
malware analysis
static analysis
author_facet Tzu-Ling Wan
Tao Ban
Shin-Ming Cheng
Yen-Ting Lee
Bo Sun
Ryoichi Isawa
Takeshi Takahashi
Daisuke Inoue
author_sort Tzu-Ling Wan
title Efficient Detection and Classification of Internet-of-Things Malware Based on Byte Sequences from Executable Files
title_short Efficient Detection and Classification of Internet-of-Things Malware Based on Byte Sequences from Executable Files
title_full Efficient Detection and Classification of Internet-of-Things Malware Based on Byte Sequences from Executable Files
title_fullStr Efficient Detection and Classification of Internet-of-Things Malware Based on Byte Sequences from Executable Files
title_full_unstemmed Efficient Detection and Classification of Internet-of-Things Malware Based on Byte Sequences from Executable Files
title_sort efficient detection and classification of internet-of-things malware based on byte sequences from executable files
publisher IEEE
series IEEE Open Journal of the Computer Society
issn 2644-1268
publishDate 2020-01-01
description Simple implementation and autonomous operation features make the Internet-of-Things (IoT) vulnerable to malware attacks. Static analysis of IoT malware executable files is a feasible approach to understanding the behavior of IoT malware for mitigation and prevention. However, current analytic approaches based on opcodes or call graphs typically do not work well with diversity in central processing unit (CPU) architectures and are often resource intensive. In this paper, we propose an efficient method for leveraging machine learning methods to detect and classify IoT malware programs. We show that reliable and efficient detection and classification can be achieved by exploring the essential discriminating information stored in the byte sequences at the entry points of executable programs. We demonstrate the performance of the proposed method using a large-scale dataset consisting of 111K benignware and 111K malware programs from seven CPU architectures. The proposed method achieves near optimal generalization performance for malware detection (99.96% accuracy) and for malware family classification (98.47% accuracy). Moreover, when CPU architecture information is considered in learning, the proposed method combined with support vector machine classifiers can yield even higher generalization performance using fewer bytes from the executable files. The findings in this paper are promising for implementing light-weight malware protection on IoT devices with limited resources.
topic Computer security
machine learning
binary code
malware analysis
static analysis
url https://ieeexplore.ieee.org/document/9240051/
work_keys_str_mv AT tzulingwan efficientdetectionandclassificationofinternetofthingsmalwarebasedonbytesequencesfromexecutablefiles
AT taoban efficientdetectionandclassificationofinternetofthingsmalwarebasedonbytesequencesfromexecutablefiles
AT shinmingcheng efficientdetectionandclassificationofinternetofthingsmalwarebasedonbytesequencesfromexecutablefiles
AT yentinglee efficientdetectionandclassificationofinternetofthingsmalwarebasedonbytesequencesfromexecutablefiles
AT bosun efficientdetectionandclassificationofinternetofthingsmalwarebasedonbytesequencesfromexecutablefiles
AT ryoichiisawa efficientdetectionandclassificationofinternetofthingsmalwarebasedonbytesequencesfromexecutablefiles
AT takeshitakahashi efficientdetectionandclassificationofinternetofthingsmalwarebasedonbytesequencesfromexecutablefiles
AT daisukeinoue efficientdetectionandclassificationofinternetofthingsmalwarebasedonbytesequencesfromexecutablefiles
_version_ 1724198438362415104
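
The description field in this record says that the method learns from byte sequences at the entry points of executable programs. The following is a minimal sketch of how such a sequence might be extracted, not the authors' implementation. It assumes the executables are ELF files, that the entry point lies inside a loadable segment, and that 64 bytes are read. None of these details are given in this record, and the function names `entry_point_bytes` and `to_feature_vector` are hypothetical.

```python
import struct

PT_LOAD = 1  # ELF program header type for loadable segments


def entry_point_bytes(data, n=64):
    """Return up to n bytes starting at the ELF entry point (n is an assumed length)."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    is64 = data[4] == 2                    # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    end = "<" if data[5] == 1 else ">"     # EI_DATA: 1 = little-endian, 2 = big-endian
    if is64:
        e_entry, e_phoff = struct.unpack_from(end + "QQ", data, 24)
        e_phentsize, e_phnum = struct.unpack_from(end + "HH", data, 54)
    else:
        e_entry, e_phoff = struct.unpack_from(end + "II", data, 24)
        e_phentsize, e_phnum = struct.unpack_from(end + "HH", data, 42)
    # Find the loadable segment containing the entry address and
    # translate the virtual address into a file offset.
    for i in range(e_phnum):
        off = e_phoff + i * e_phentsize
        if is64:
            p_type, _flags, p_offset, p_vaddr, _paddr, p_filesz = struct.unpack_from(
                end + "IIQQQQ", data, off)
        else:
            p_type, p_offset, p_vaddr, _paddr, p_filesz = struct.unpack_from(
                end + "IIIII", data, off)
        if p_type == PT_LOAD and p_vaddr <= e_entry < p_vaddr + p_filesz:
            start = p_offset + (e_entry - p_vaddr)
            stop = min(start + n, p_offset + p_filesz)
            return data[start:stop]
    raise ValueError("entry point not inside any loadable segment")


def to_feature_vector(seq, n=64):
    """Zero-pad to n bytes and scale each byte to [0, 1] (an assumed encoding)."""
    padded = seq.ljust(n, b"\x00")[:n]
    return [b / 255.0 for b in padded]
```

The resulting fixed-length vectors could then be passed to any standard classifier, such as the support vector machines mentioned in the description. Because only the ELF header and program headers are parsed, the same routine handles 32-bit and 64-bit files of either byte order without disassembling architecture-specific instructions.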