Ina-BWR: Indonesian bigram word rule for multi-label student complaints

Handling multi-label student complaints is an interesting research topic. One technique used for handling such complaints is the Bag of Words (BoW) method. In this research, a bigram word rule and preprocessing steps are proposed to increase the accuracy of multi-label classification results....

Bibliographic Details
Main Authors: Tora Fahrudin, Joko Lianto Buliali, Chastine Fatichah
Format: Article
Language: English
Published: Elsevier 2019-11-01
Series: Egyptian Informatics Journal
Online Access: http://www.sciencedirect.com/science/article/pii/S1110866518302524
id doaj-eb4a8bab86aa4560ba4644781f648010
record_format Article
spelling doaj-eb4a8bab86aa4560ba4644781f648010
2021-07-02T10:30:51Z
eng
Elsevier
Egyptian Informatics Journal
1110-8665
2019-11-01
Volume 20, Issue 3, Pages 151-161
Ina-BWR: Indonesian bigram word rule for multi-label student complaints
Tora Fahrudin (Department of Informatics, Institut Teknologi Sepuluh Nopember, Surabaya 60111, Indonesia; School of Applied Science, Telkom University, Bandung 40257, Indonesia; corresponding author)
Joko Lianto Buliali (Department of Informatics, Institut Teknologi Sepuluh Nopember, Surabaya 60111, Indonesia)
Chastine Fatichah (Department of Informatics, Institut Teknologi Sepuluh Nopember, Surabaya 60111, Indonesia)
collection DOAJ
language English
format Article
sources DOAJ
author Tora Fahrudin
Joko Lianto Buliali
Chastine Fatichah
title Ina-BWR: Indonesian bigram word rule for multi-label student complaints
publisher Elsevier
series Egyptian Informatics Journal
issn 1110-8665
publishDate 2019-11-01
description Handling multi-label student complaints is an interesting research topic. One technique used for handling such complaints is the Bag of Words (BoW) method. In this research, a bigram word rule and preprocessing steps are proposed to increase the accuracy of multi-label classification results. To show the effectiveness of the proposed method, Telkom University student data and additional relevant data collected by hashtag are used as testing data. We develop the Indonesian Bigram Word Rule for Multi-label Student Complaints (Ina-BWR) to identify multi-label student problems based on a bigram word rule. Ina-BWR consists of three processes: preprocessing informal text, identifying the complaint, and identifying the object from the text. Additional preprocessing techniques are applied to formalize the text, such as parsing hashtags, correcting affixed words, correcting conjunctions, parsing suffixed personal pronouns, and correcting typos. The Indonesian bigram word rule is adopted from opinion identification rules, with three additional corpora, (-)NN, (-)JJ and (-)VB, to identify student complaints. To identify complaints, four label corpora were created manually. The experimental results show that Ina-BWR can increase the accuracy of the Personal, Subject and Relation labels. The best accuracy across the four labels is obtained when Ina-BWR is combined with the BoW method.
Keywords: Multi-label, Student complaints, Bag of word, Indonesian bigram word rule, Opinion identification rules
url http://www.sciencedirect.com/science/article/pii/S1110866518302524
_version_ 1721331953352835072
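
The description above mentions hashtag parsing and bigram rules built on negated nouns, adjectives and verbs, written (-)NN, (-)JJ and (-)VB. The record does not give the actual rules or corpora, so the following is only a minimal sketch of the general idea. The negation word list, the CamelCase hashtag convention, the tag names in the sample input, and both function names are assumptions made for illustration, not the authors' implementation.

```python
import re
from typing import List, Tuple

Tagged = Tuple[str, str]  # (word, part-of-speech tag)

# Assumed Indonesian negation words: "not", "lacking", "not yet".
# The paper's own negation corpus is not reproduced in this record.
NEGATIONS = {"tidak", "kurang", "belum"}

# Tags that may follow a negation, mirroring (-)NN, (-)JJ and (-)VB.
NEGATED_TAGS = {"NN", "JJ", "VB"}


def split_hashtag(tag: str) -> List[str]:
    """Split a CamelCase hashtag into lowercase words (assumed convention)."""
    body = tag.lstrip("#")
    return [w.lower() for w in re.findall(r"[A-Z]?[a-z]+|[A-Z]+|\d+", body)]


def find_negated_bigrams(tokens: List[Tagged]) -> List[Tuple[str, str]]:
    """Return (negation, word) pairs where a negation precedes NN, JJ or VB."""
    matches = []
    # Walk over adjacent token pairs, i.e. the bigrams of the sentence.
    for (w1, _), (w2, t2) in zip(tokens, tokens[1:]):
        if w1.lower() in NEGATIONS and t2 in NEGATED_TAGS:
            matches.append((w1.lower(), w2.lower()))
    return matches


if __name__ == "__main__":
    words = split_hashtag("#DosenTidakHadir")  # "lecturer not present"
    print(words)
    # Hand-assigned tags for the example; a real system would use a POS tagger.
    tagged = [("dosen", "NN"), ("tidak", "NEG"), ("hadir", "VB")]
    print(find_negated_bigrams(tagged))
```

In a combined setup such as the one the description reports, matches like these could be used alongside Bag of Words features. How the two are actually merged is not stated in this record.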