A Phishing-Attack-Detection Model Using Natural Language Processing and Deep Learning

Phishing is a type of cyber-attack that aims to deceive users, usually using fraudulent web pages that appear legitimate. Currently, one of the most common ways to detect these phishing pages according to their content is by entering words non-sequentially into Deep Learning (DL) algorithms, i.e., regardless of the order in which they have entered the algorithms.

Bibliographic Details
Main Authors:	Benavides-Astudillo, E. (Author), Fuertes, W. (Author), Nuñez-Agurto, D. (Author), Rodríguez-Galán, G. (Author), Sanchez-Gordon, S. (Author)
Format:	Article
Language:	English
Published:	MDPI 2023
Subjects:	BiGRU; BiLSTM; deep learning; GloVe; GRU; Keras embedding; LSTM; natural language processing; NLP; phishing
Online Access:	View Fulltext in Publisher View in Scopus


LEADER	02489nam a2200313Ia 4500
001	10.3390-app13095275
008	230529s2023 CNT 000 0 und d
020			|a 20763417 (ISSN)
245	1	0	|a A Phishing-Attack-Detection Model Using Natural Language Processing and Deep Learning
260		0	|b MDPI |c 2023
856			|z View Fulltext in Publisher |u https://doi.org/10.3390/app13095275
856			|z View in Scopus |u https://www.scopus.com/inward/record.uri?eid=2-s2.0-85159296697&doi=10.3390%2fapp13095275&partnerID=40&md5=233d4bcc6788c1b56669c51f9566ecfc
520	3		|a Phishing is a type of cyber-attack that aims to deceive users, usually using fraudulent web pages that appear legitimate. Currently, one of the most-common ways to detect these phishing pages according to their content is by entering words non-sequentially into Deep Learning (DL) algorithms, i.e., regardless of the order in which they have entered the algorithms. However, this approach causes the intrinsic richness of the relationship between words to be lost. In the field of cyber-security, the innovation of this study is to propose a model that detects phishing attacks based on the text of suspicious web pages and not on URL addresses, using Natural Language Processing (NLP) and DL algorithms. We used the Keras Embedding Layer with Global Vectors for Word Representation (GloVe) to exploit the web page content’s semantic and syntactic features. We first performed an analysis using NLP and Word Embedding, and then, these data were introduced into a DL algorithm. In addition, to assess which DL algorithm works best, we evaluated four alternative algorithms: Long Short-Term Memory (LSTM), Bidirectional LSTM (BiLSTM), Gated Recurrent Unit (GRU), and Bidirectional GRU (BiGRU). As a result, it can be concluded that the proposed model is promising because the mean accuracy achieved by each of the four DL algorithms was at least 96.7%, while the best performer was BiGRU with 97.39%. © 2023 by the authors.
650	0	4	|a BiGRU
650	0	4	|a BiLSTM
650	0	4	|a deep learning
650	0	4	|a GloVe
650	0	4	|a GRU
650	0	4	|a Keras embedding
650	0	4	|a LSTM
650	0	4	|a natural language processing
650	0	4	|a NLP
650	0	4	|a phishing
700	1	0	|a Benavides-Astudillo, E. |e author
700	1	0	|a Fuertes, W. |e author
700	1	0	|a Nuñez-Agurto, D. |e author
700	1	0	|a Rodríguez-Galán, G. |e author
700	1	0	|a Sanchez-Gordon, S. |e author
773			|t Applied Sciences (Switzerland)
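
Illustrative note (not part of the catalogued record): the abstract in field 520 contrasts feeding words to a model non-sequentially with feeding them in order. The sketch below is only a minimal illustration of that contrast, not the authors' implementation. It uses the Python standard library alone; the two sample texts, the naive whitespace tokenizer, and the helper names (tokenize, build_vocab, bag_of_words, to_sequence) are all hypothetical. The padded integer sequence it builds is the kind of input an embedding layer followed by a recurrent network would typically consume, but no Keras or GloVe code is shown here.

```python
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def build_vocab(texts):
    """Assign each distinct token a positive integer id; 0 is reserved for padding."""
    vocab = {}
    for text in texts:
        for tok in tokenize(text):
            if tok not in vocab:
                vocab[tok] = len(vocab) + 1
    return vocab


def bag_of_words(text):
    """Order-free representation: token counts only."""
    return Counter(tokenize(text))


def to_sequence(text, vocab, max_len):
    """Order-preserving representation: token ids, truncated or zero-padded to max_len.

    Tokens missing from the vocabulary are mapped to 0 in this sketch.
    """
    ids = [vocab.get(tok, 0) for tok in tokenize(text)][:max_len]
    return ids + [0] * (max_len - len(ids))


# Hypothetical page texts containing the same words in a different order.
page_a = "do not share your password"
page_b = "share your password do not"

vocab = build_vocab([page_a, page_b])

# The bag-of-words view cannot tell the two texts apart ...
same_bag = bag_of_words(page_a) == bag_of_words(page_b)

# ... while the ordered id sequences differ, so a sequence model can use word order.
seq_a = to_sequence(page_a, vocab, max_len=6)
seq_b = to_sequence(page_b, vocab, max_len=6)

print("identical bags:", same_bag)
print("sequence a:", seq_a)
print("sequence b:", seq_b)
```

In this toy case the two texts carry opposite meanings yet produce the same word counts, which is the loss of word relationships that the abstract attributes to non-sequential input.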
