A Lightweight Android Malware Classifier Using Novel Feature Selection Methods

Smartphones and mobile tablets play significant roles in daily life and have led to an increase in the number of users of this technology. The rising number of mobile device end-users has resulted in the generation of malware by hackers. Thus, mobile devices are becoming vulnerable to malware. Machi...

Bibliographic Details
Main Authors: Ahmad Salah, Eman Shalabi, Walid Khedr
Format: Article
Language: English
Published: MDPI AG 2020-05-01
Series: Symmetry
Subjects: SVM
Online Access: https://www.mdpi.com/2073-8994/12/5/858
id doaj-5ec47cc900964d6ab68b4f87f9ebcb37
record_format Article
spelling doaj-5ec47cc900964d6ab68b4f87f9ebcb37
2020-11-25T03:21:22Z
eng
MDPI AG
Symmetry
2073-8994
2020-05-01
12
858
858
10.3390/sym12050858
A Lightweight Android Malware Classifier Using Novel Feature Selection Methods
Ahmad Salah [0]
Eman Shalabi [1]
Walid Khedr [2]
[0] College of Computer Science and Electrical Engineering, Hunan University, Changsha 410082, China
[1] Faculty of Computers and Informatics, Zagazig University, Zagazig 44519, Egypt
[2] Faculty of Computers and Informatics, Zagazig University, Zagazig 44519, Egypt
Smartphones and mobile tablets play significant roles in daily life and have led to an increase in the number of users of this technology. The rising number of mobile device end-users has resulted in the generation of malware by hackers. Thus, mobile devices are becoming vulnerable to malware. Machine learning plays an important role in the detection of mobile malware applications. In this study, we focus on static analysis for Android malware detection. The ultimate goal of this research is to identify the symmetric features shared across Android malware applications so that they can be detected easily. Many state-of-the-art methods focus on extracting asymmetric patterns of one category of features, e.g., application permissions, to distinguish malware applications from benign ones. In this work, we propose a compromise: we consider different types of static features and select the most important features that affect the detection process. These features represent the symmetric pattern to be used for the classification task. Inspired by TF-IDF, we propose a novel method of feature selection. Moreover, we propose a new method for merging the Android application URLs into a single feature called the URL_score. Several linear machine learning classifiers are utilized to evaluate the proposed method. The proposed methods significantly reduce the feature space, i.e., the symmetric pattern, of the Android application dataset and the memory size of the final model. In addition, the proposed model achieves the highest reported accuracy for the Drebin dataset to date. Based on the evaluation results, the linear support vector machine achieves an accuracy of 99%.
https://www.mdpi.com/2073-8994/12/5/858
malware detection
Android malware
classifier
SVM
feature selection
TF-IDF
collection DOAJ
language English
format Article
sources DOAJ
author Ahmad Salah
Eman Shalabi
Walid Khedr
spellingShingle Ahmad Salah
Eman Shalabi
Walid Khedr
A Lightweight Android Malware Classifier Using Novel Feature Selection Methods
Symmetry
malware detection
Android malware
classifier
SVM
feature selection
TF-IDF
author_facet Ahmad Salah
Eman Shalabi
Walid Khedr
author_sort Ahmad Salah
title A Lightweight Android Malware Classifier Using Novel Feature Selection Methods
title_short A Lightweight Android Malware Classifier Using Novel Feature Selection Methods
title_full A Lightweight Android Malware Classifier Using Novel Feature Selection Methods
title_fullStr A Lightweight Android Malware Classifier Using Novel Feature Selection Methods
title_full_unstemmed A Lightweight Android Malware Classifier Using Novel Feature Selection Methods
title_sort lightweight android malware classifier using novel feature selection methods
publisher MDPI AG
series Symmetry
issn 2073-8994
publishDate 2020-05-01
description Smartphones and mobile tablets play significant roles in daily life and have led to an increase in the number of users of this technology. The rising number of mobile device end-users has resulted in the generation of malware by hackers. Thus, mobile devices are becoming vulnerable to malware. Machine learning plays an important role in the detection of mobile malware applications. In this study, we focus on static analysis for Android malware detection. The ultimate goal of this research is to identify the symmetric features shared across Android malware applications so that they can be detected easily. Many state-of-the-art methods focus on extracting asymmetric patterns of one category of features, e.g., application permissions, to distinguish malware applications from benign ones. In this work, we propose a compromise: we consider different types of static features and select the most important features that affect the detection process. These features represent the symmetric pattern to be used for the classification task. Inspired by TF-IDF, we propose a novel method of feature selection. Moreover, we propose a new method for merging the Android application URLs into a single feature called the URL_score. Several linear machine learning classifiers are utilized to evaluate the proposed method. The proposed methods significantly reduce the feature space, i.e., the symmetric pattern, of the Android application dataset and the memory size of the final model. In addition, the proposed model achieves the highest reported accuracy for the Drebin dataset to date. Based on the evaluation results, the linear support vector machine achieves an accuracy of 99%.
topic malware detection
Android malware
classifier
SVM
feature selection
TF-IDF
url https://www.mdpi.com/2073-8994/12/5/858
work_keys_str_mv AT ahmadsalah alightweightandroidmalwareclassifierusingnovelfeatureselectionmethods
AT emanshalabi alightweightandroidmalwareclassifierusingnovelfeatureselectionmethods
AT walidkhedr alightweightandroidmalwareclassifierusingnovelfeatureselectionmethods
AT ahmadsalah lightweightandroidmalwareclassifierusingnovelfeatureselectionmethods
AT emanshalabi lightweightandroidmalwareclassifierusingnovelfeatureselectionmethods
AT walidkhedr lightweightandroidmalwareclassifierusingnovelfeatureselectionmethods
_version_ 1724615125339471872