Penerapan Metode Clustering Text Mining Untuk Pengelompokan Berita Pada Unstructured Textual Data

Good governance means a government whose programs are known to, and benefit, the people. In the Bali Provincial Government, the unit responsible for disseminating information is the Bureau of Public Relations of the Regional Secretariat of Bali, which does so through the media it owns. Because at the time of news input to the media in this case...


Bibliographic Details
Main Authors: Nyoman Gede Yudiarta, Made Sudarma, Wayan Gede Ariastina
Format: Article
Language: English
Published: Universitas Udayana 2018-12-01
Series: Majalah Ilmiah Teknologi Elektro
Online Access: https://ojs.unud.ac.id/index.php/JTE/article/view/41047
id doaj-a64c9a4fecda4bc18fdfa0268e4db181
record_format Article
spelling doaj-a64c9a4fecda4bc18fdfa0268e4db181 | 2020-11-25T02:36:57Z | eng | Universitas Udayana | Majalah Ilmiah Teknologi Elektro | ISSN 1693-2951, 2503-2372 | 2018-12-01 | vol. 17, no. 3, pp. 339-344 | doi:10.24843/MITE.2018.v17i03.P06 | 41047 | Penerapan Metode Clustering Text Mining Untuk Pengelompokan Berita Pada Unstructured Textual Data | Nyoman Gede Yudiarta; Made Sudarma; Wayan Gede Ariastina | https://ojs.unud.ac.id/index.php/JTE/article/view/41047
collection DOAJ
language English
format Article
sources DOAJ
author Nyoman Gede Yudiarta
Made Sudarma
Wayan Gede Ariastina
title Penerapan Metode Clustering Text Mining Untuk Pengelompokan Berita Pada Unstructured Textual Data
publisher Universitas Udayana
series Majalah Ilmiah Teknologi Elektro
issn 1693-2951
2503-2372
publishDate 2018-12-01
description Good governance means a government whose programs are known to, and benefit, the people. In the Bali Provincial Government, the unit responsible for disseminating information is the Bureau of Public Relations of the Regional Secretariat of Bali, which does so through the media it owns. Because no category is recorded when news is entered into these media, in this case the Public Relations Bureau website, it is difficult to tell which news items belong to which category. Clustering is one method for solving this problem, and K-Means is one of the algorithms used for clustering. This study focuses on designing a system that groups news data into categories using K-Means. To make clustering easier, the documents are preprocessed first: case folding, tokenization, filtering, and stemming. TF-IDF is then applied to weight the terms obtained from the preprocessed documents. Experiments on data sets of 50, 100, 200, 300, 400, and 500 items show that K-Means can cluster the news with satisfactory accuracy: across all test data, it reaches an average precision of 73.11%, a recall of 69.65%, and a purity of 0.80. Comparing the individual tests, the 50-item test has the highest average precision (76.92%) and recall (79.58%), while purity is highest on the 300-item test, at 0.83.
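The pipeline named in the description (case folding, tokenization, filtering, stemming, TF-IDF weighting, K-Means, purity) can be sketched roughly as follows. This is a minimal illustration in standard-library Python, not the authors' implementation: the stop-word list, the suffix-stripping rules, the English sample headlines, the seeding of centroids from the first k documents, and all function names are assumptions made for the example. The abstract does not say which stemmer, stop-word list, or initialization the study used.

```python
import math
import re
from collections import Counter

# Hypothetical stop words and suffix rules, for illustration only.
STOPWORDS = {"the", "a", "of", "in", "and", "to", "is"}
SUFFIXES = ("ing", "es", "s")

def preprocess(text):
    """Case folding, tokenization, filtering, and naive suffix stemming."""
    tokens = re.findall(r"[a-z]+", text.lower())        # case folding + tokenization
    tokens = [t for t in tokens if t not in STOPWORDS]  # filtering
    stemmed = []
    for t in tokens:
        for suf in SUFFIXES:
            # Strip the first matching suffix, keeping at least 3 letters.
            if t.endswith(suf) and len(t) - len(suf) >= 3:
                t = t[: -len(suf)]
                break
        stemmed.append(t)
    return stemmed

def tfidf(docs):
    """Return L2-normalised TF-IDF vectors (as dicts) for tokenised documents."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        vec = {t: c * math.log(n / df[t]) for t, c in Counter(doc).items()}
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({t: w / norm for t, w in vec.items()})
    return vectors

def sq_dist(u, v):
    """Squared Euclidean distance between two sparse vectors."""
    return sum((u.get(k, 0.0) - v.get(k, 0.0)) ** 2 for k in set(u) | set(v))

def kmeans(vectors, k, iters=20):
    """Lloyd-style K-Means; the first k vectors seed the centroids (an assumption)."""
    centroids = [dict(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: sq_dist(v, centroids[c]))
                  for v in vectors]
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                total = {}
                for m in members:
                    for t, w in m.items():
                        total[t] = total.get(t, 0.0) + w
                centroids[c] = {t: w / len(members) for t, w in total.items()}
    return labels

def purity(labels, truth):
    """Share of documents that belong to the majority class of their cluster."""
    hits = 0
    for c in set(labels):
        classes = [t for lab, t in zip(labels, truth) if lab == c]
        hits += Counter(classes).most_common(1)[0][1]
    return hits / len(labels)

# Made-up sample headlines and categories.
news = [
    "The governor opens the new hospital",
    "Football team wins the match",
    "Governor visits hospital patients",
    "The football match ends in a draw",
]
truth = ["government", "sport", "government", "sport"]

labels = kmeans(tfidf([preprocess(n) for n in news]), k=2)
print(labels, purity(labels, truth))  # prints: [0, 1, 0, 1] 1.0
```

Purity is computed here as the fraction of documents that fall in the majority true category of their cluster, which is the usual definition. How the study derived precision and recall from the clusters is not stated in the abstract, so those metrics are left out of the sketch.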
url https://ojs.unud.ac.id/index.php/JTE/article/view/41047