Optimization of K Value in KNN Algorithm for Spam and Ham Email Classification
Email abuse with the potential to harm others is widespread. Such abuse is commonly known as spam, and it can contain advertisements, phishing scams, and even malware. This study aims to classify emails as spam or ham using the KNN method as an effort to reduce th...
Main Authors: | Eko Laksono, Achmad Basuki, Fitra Bachtiar |
---|---|
Format: | Article |
Language: | Indonesian |
Published: | Ikatan Ahli Informatika Indonesia, 2020-04-01 |
Series: | Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi) |
Subjects: | classification, email spam, knn, frequency distribution clustering, k-means clustering |
Online Access: | http://jurnal.iaii.or.id/index.php/RESTI/article/view/1845 |
id |
doaj-c970c8edb32a42339fdfadbcb85cdda0 |
---|---|
record_format |
Article |
spelling |
doaj-c970c8edb32a42339fdfadbcb85cdda0; 2020-11-25T02:01:34Z; ind; Ikatan Ahli Informatika Indonesia; Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi); ISSN 2580-0760; 2020-04-01; vol. 4, no. 2, pp. 377-383; DOI 10.29207/resti.v4i2.1845; article 1845; Optimization of K Value in KNN Algorithm for Spam and Ham Email Classification; Eko Laksono (Brawijaya University), Achmad Basuki (Brawijaya University), Fitra Bachtiar (Brawijaya University); http://jurnal.iaii.or.id/index.php/RESTI/article/view/1845; classification, email spam, knn, frequency distribution clustering, k-means clustering |
collection |
DOAJ |
language |
Indonesian |
format |
Article |
sources |
DOAJ |
author |
Eko Laksono, Achmad Basuki, Fitra Bachtiar |
author_sort |
Eko Laksono |
title |
Optimization of K Value in KNN Algorithm for Spam and Ham Email Classification |
title_sort |
optimization of k value in knn algorithm for spam and ham email classification |
publisher |
Ikatan Ahli Informatika Indonesia |
series |
Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi) |
issn |
2580-0760 |
publishDate |
2020-04-01 |
description |
Email abuse with the potential to harm others is widespread. Such abuse is commonly known as spam, and it can contain advertisements, phishing scams, and even malware. This study aims to classify emails as spam or ham using the KNN method as an effort to reduce the amount of spam. KNN can classify an email as spam or ham by evaluating it with different values of K. Evaluation with a confusion matrix showed that KNN with K = 1 achieved the highest accuracy, 91.4%. The study further reports that optimizing the K value in KNN with frequency distribution clustering yields an accuracy of 100%, while k-means clustering yields an accuracy of 99%. Based on these accuracy values, both frequency distribution clustering and k-means clustering can be used to find the optimal K value for KNN in spam email classification. |
topic |
classification, email spam, knn, frequency distribution clustering, k-means clustering |
url |
http://jurnal.iaii.or.id/index.php/RESTI/article/view/1845 |
work_keys_str_mv |
AT ekolaksono optimizationofkvalueinknnalgorithmforspamandhamemailclassification AT achmadbasuki optimizationofkvalueinknnalgorithmforspamandhamemailclassification AT fitrabachtiar optimizationofkvalueinknnalgorithmforspamandhamemailclassification |
_version_ |
1724957041072537600 |
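
The description above mentions classifying emails with KNN under several K values and comparing them by confusion-matrix accuracy. The following is a minimal Python sketch of that general idea, not the authors' implementation: the two numeric features per email, the toy training and test sets, and the helper names (`knn_predict`, `confusion_matrix`, `best_k`) are all assumptions made for illustration. It does not cover the frequency distribution or k-means clustering steps, and its toy results are unrelated to the accuracies reported in the article.

```python
import math
from collections import Counter

# Hypothetical toy data: each email is reduced to two invented numeric
# features (for example, link count and promotional-word count).
TRAIN = [
    ((0, 0), "ham"), ((1, 0), "ham"), ((0, 1), "ham"), ((1, 1), "ham"),
    ((5, 5), "spam"), ((6, 5), "spam"), ((5, 6), "spam"), ((6, 6), "spam"),
    ((1.2, 2.0), "spam"),  # a spam point lying close to the ham group
]
TEST = [
    ((0.5, 0.5), "ham"), ((5.5, 5.5), "spam"),
    ((1, 2), "ham"), ((4, 5), "spam"),
]


def knn_predict(train, query, k):
    """Label `query` by majority vote among its k nearest training points."""
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


def confusion_matrix(y_true, y_pred, positive="spam"):
    """Return (tp, fp, fn, tn) counts, treating `positive` as the target class."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def accuracy(tp, fp, fn, tn):
    """Fraction of all predictions that were correct."""
    return (tp + tn) / (tp + fp + fn + tn)


def best_k(train, test, candidates):
    """Score every candidate K on `test`; return the best K and all scores."""
    y_true = [label for _, label in test]
    scores = {}
    for k in candidates:
        y_pred = [knn_predict(train, features, k) for features, _ in test]
        scores[k] = accuracy(*confusion_matrix(y_true, y_pred))
    # On equal accuracy, prefer the smaller K.
    best = max(scores, key=lambda k: (scores[k], -k))
    return best, scores


if __name__ == "__main__":
    k, scores = best_k(TRAIN, TEST, [1, 3, 5])
    print("accuracy per K:", scores)
    print("selected K:", k)
```

On this invented data a single nearby spam point misleads K = 1, so a larger K scores higher; on real email features the ranking can differ, as the K = 1 result in the description shows.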