Learning-Based Detection for Malicious Android Application Using Code Vectorization

Makers of malicious APKs (Android Application Packages) use techniques such as code obfuscation and code encryption to evade existing detection methods, which makes malicious code increasingly difficult to detect accurately. A report indicates t...

Bibliographic Details
Main Authors: Lin Liu, Wang Ren, Feng Xie, Shengwei Yi, Junkai Yi, Peng Jia
Format: Article
Language: English
Published: Hindawi-Wiley 2021-01-01
Series: Security and Communication Networks
Online Access: http://dx.doi.org/10.1155/2021/9964224
id doaj-ec14b05c39464cef8b15424276401862
record_format Article
spelling doaj-ec14b05c39464cef8b15424276401862
2021-08-30T00:00:23Z
eng
Hindawi-Wiley
Security and Communication Networks
1939-0122
2021-01-01
2021
10.1155/2021/9964224
Learning-Based Detection for Malicious Android Application Using Code Vectorization
Lin Liu, China Information Technology Security Evaluation Center
Wang Ren, China Information Technology Security Evaluation Center
Feng Xie, China Information Technology Security Evaluation Center
Shengwei Yi, China Information Technology Security Evaluation Center
Junkai Yi, Beijing Information Science and Technology University
Peng Jia, College of Cybersecurity
http://dx.doi.org/10.1155/2021/9964224
collection DOAJ
language English
format Article
sources DOAJ
author Lin Liu
Wang Ren
Feng Xie
Shengwei Yi
Junkai Yi
Peng Jia
title Learning-Based Detection for Malicious Android Application Using Code Vectorization
publisher Hindawi-Wiley
series Security and Communication Networks
issn 1939-0122
publishDate 2021-01-01
description Makers of malicious APKs (Android Application Packages) use techniques such as code obfuscation and code encryption to evade existing detection methods, which makes malicious code increasingly difficult to detect accurately. A report indicates that a new malicious Android app is created every 10 seconds. Combating this volume of malware requires a scalable detection approach that identifies malicious apps effectively and efficiently. Common static detection methods often rely on hash matching and virus analysis, and so cannot quickly detect new malicious Android applications and their variants. This paper proposes a detection method for malicious Android applications implemented with a deep network fusion model. The hybrid model only needs to be trained on samples to identify malicious applications with high accuracy, which makes it better suited than existing methods to detecting new malicious Android applications. The method extracts static features from the core code of an Android application by decompiling its APK file, vectorizes the code, and classifies it with a deep learning network. Experiments on a data set of 10,170 apps show that decisions from the hybrid model significantly increase the malware detection rate on a real device, which verifies the superiority of this method in detecting malicious code.
url http://dx.doi.org/10.1155/2021/9964224