A Novel Approach for Detecting DGA-Based Botnets in DNS Queries Using Machine Learning Techniques

In today’s security landscape, advanced threats are becoming increasingly difficult to detect as attack patterns diversify. Classical approaches that rely heavily on static matching, such as blacklisting or regular expression patterns, can be inflexible and unreliable at detecting m...

Bibliographic Details
Main Authors: Ali Soleymani, Fatemeh Arabgol
Format: Article
Language: English
Published: Hindawi Limited 2021-01-01
Series: Journal of Computer Networks and Communications
Online Access: http://dx.doi.org/10.1155/2021/4767388
id doaj-5a5fb3374a454d5481a7118be6ade52b
record_format Article
spelling doaj-5a5fb3374a454d5481a7118be6ade52b | 2021-07-19T01:04:11Z | eng | Hindawi Limited | Journal of Computer Networks and Communications | 2090-715X | 2021-01-01 | 2021 | 10.1155/2021/4767388 | A Novel Approach for Detecting DGA-Based Botnets in DNS Queries Using Machine Learning Techniques | Ali Soleymani 0 | Fatemeh Arabgol 1 | Faculty of Computer Engineering | Faculty of Computer Engineering | http://dx.doi.org/10.1155/2021/4767388
collection DOAJ
language English
format Article
sources DOAJ
author Ali Soleymani
Fatemeh Arabgol
title A Novel Approach for Detecting DGA-Based Botnets in DNS Queries Using Machine Learning Techniques
publisher Hindawi Limited
series Journal of Computer Networks and Communications
issn 2090-715X
publishDate 2021-01-01
description In today’s security landscape, advanced threats are becoming increasingly difficult to detect as attack patterns diversify. Classical approaches that rely heavily on static matching, such as blacklisting or regular expression patterns, can be inflexible and unreliable at detecting malicious data in system data. This is where machine learning techniques can show their value, providing new insights and higher detection rates. This research investigates the behavior of botnets that use domain-flux techniques to hide their command and control channels, and describes the machine learning algorithms and text mining used to analyze the network DNS protocol and identify botnets. For this purpose, extracted and labeled domain name datasets containing healthy and infected DGA botnet data were used. Data preprocessing techniques based on a text-mining approach were applied to explore domain name strings with n-gram analysis and PCA. The model's performance is improved by extracting statistical features with principal component analysis. The proposed model was evaluated with several machine learning classifiers: decision tree, support vector machine, random forest, and logistic regression. Experimental results show that the random forest algorithm can be used effectively in botnet detection and achieves the best detection accuracy among the classifiers evaluated.
url http://dx.doi.org/10.1155/2021/4767388
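
The description above mentions exploring domain name strings with n-gram analysis. The following is a minimal sketch of that single step only, not the authors' implementation: it assumes character bigrams taken from the label before the first dot, and the function names (char_ngrams, build_vocabulary, ngram_vector) are hypothetical. The PCA and classifier stages from the description are not shown.

```python
from collections import Counter
from typing import List


def char_ngrams(domain: str, n: int = 2) -> List[str]:
    """Return overlapping character n-grams of the label before the first dot.

    Assumption for illustration: only the leftmost label is analyzed.
    """
    label = domain.lower().rstrip(".").split(".")[0]
    return [label[i:i + n] for i in range(len(label) - n + 1)]


def build_vocabulary(domains: List[str], n: int = 2) -> List[str]:
    """Collect every n-gram seen in the sample domains, in sorted order."""
    seen = set()
    for domain in domains:
        seen.update(char_ngrams(domain, n))
    return sorted(seen)


def ngram_vector(domain: str, vocabulary: List[str], n: int = 2) -> List[int]:
    """Count how often each vocabulary n-gram occurs in the domain."""
    counts = Counter(char_ngrams(domain, n))
    return [counts[gram] for gram in vocabulary]


if __name__ == "__main__":
    # Made-up sample names, used only to show the shape of the output.
    samples = ["google.com", "goo.net"]
    vocab = build_vocabulary(samples)
    for name in samples:
        print(name, ngram_vector(name, vocab))
```

Count vectors of this kind could then be passed to a dimensionality-reduction step and a classifier, as the description outlines; those stages would need a numerical library and are left out here.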