Rule-based Approach on Extraction of Malay Compound Nouns in Standard Malay Document

A Malay compound noun is defined as a word form that arises when two or more words are combined into a single syntactic unit with a specific meaning. A compound noun acts as one unit and is spelled as separate words, except for established compounds, which are written as a single word. The basic charact...


Bibliographic Details
Main Authors:	Bakar, Z.A. (Author), Ismail, N.K. (Author), Rawi, M.I.M. (Author)
Format:	Article
Language:	English
Published:	Institute of Physics Publishing 2017
Subjects:	Basic characteristics; Extracting compounds; Extraction; Linguistic approach; Linguistics; Machine translations; Part of speech tagging; Pre-processing step; Relevant relations; Rule-based approach
Online Access:	View Fulltext in Publisher View in Scopus


LEADER	03128nas a2200277Ia 4500
001	10.1088-1757-899X-226-1-012106
008	220120c20179999CNT?? ? 0 0und d
020			\|a 17578981 (ISSN)
245	1	0	\|a Rule-based Approach on Extraction of Malay Compound Nouns in Standard Malay Document
260		0	\|b Institute of Physics Publishing \|c 2017
520	3		\|a A Malay compound noun is defined as a word form that arises when two or more words are combined into a single syntactic unit with a specific meaning. A compound noun acts as one unit and is spelled as separate words, except for established compounds, which are written as a single word. A basic characteristic of a compound noun, observable in Malay sentences, is its frequency in the text itself. Extraction of compound nouns is significant for downstream research such as text summarization, grammar checking, sentiment analysis, machine translation and word categorization. Many research efforts have proposed extracting Malay compound nouns using linguistic approaches, and most existing methods target bi-gram noun+noun compounds. However, the results of these methods still leave room for improvement, so the effectiveness of compound noun extraction needs to be enhanced. This paper explores a linguistic method for extracting compound nouns from a standard Malay corpus. A standard dataset is used to provide a common platform for evaluating research on the recognition of compound nouns in Malay sentences. The study proposes a modification of the linguistic approach in order to enhance compound noun extraction. Several pre-processing steps are involved, including normalization, tokenization and tagging. The first step that uses the linguistic approach in this study is Part-of-Speech (POS) tagging. Finally, we describe several rule-based patterns and modify the rules to capture the most relevant relation between the first word and the second word. The effectiveness of the relations used in our study is measured using recall, precision and F1-score. Comparison against baseline values is essential because it shows whether the results have improved. © Published under licence by IOP Publishing Ltd.
650	0	4	\|a Basic characteristics
650	0	4	\|a Extracting compounds
650	0	4	\|a Extraction
650	0	4	\|a Linguistic approach
650	0	4	\|a Linguistics
650	0	4	\|a Machine translations
650	0	4	\|a Part of speech tagging
650	0	4	\|a Pre-processing step
650	0	4	\|a Relevant relations
650	0	4	\|a Rule-based approach
700	1	0	\|a Bakar, Z.A. \|e author
700	1	0	\|a Ismail, N.K. \|e author
700	1	0	\|a Rawi, M.I.M. \|e author
856			\|z View Fulltext in Publisher \|u https://doi.org/10.1088/1757-899X/226/1/012106
856			\|z View in Scopus \|u https://www.scopus.com/inward/record.uri?eid=2-s2.0-85028680673&doi=10.1088%2f1757-899X%2f226%2f1%2f012106&partnerID=40&md5=88dbf13ae59aadb278c78908876af079
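
The pipeline summarized in the abstract (POS tagging, a bi-gram noun+noun rule, then evaluation by precision, recall and F1-score) can be illustrated with a minimal sketch. This is not the authors' implementation: the tag names ("NN", "VB", and so on), the function names, the example sentence and the gold list are all assumptions made for illustration, and the input is taken to be already normalized, tokenized and tagged.

```python
from typing import FrozenSet, List, Set, Tuple

# A token paired with its POS tag. The tag set is hypothetical.
TaggedToken = Tuple[str, str]


def extract_noun_bigrams(
    tagged: List[TaggedToken],
    noun_tags: FrozenSet[str] = frozenset({"NN"}),
) -> List[str]:
    """Return every adjacent word pair in which both words carry a noun tag."""
    candidates = []
    for (w1, t1), (w2, t2) in zip(tagged, tagged[1:]):
        if t1 in noun_tags and t2 in noun_tags:
            candidates.append(f"{w1} {w2}")
    return candidates


def precision_recall_f1(predicted: Set[str], gold: Set[str]) -> Tuple[float, float, float]:
    """Standard set-based precision, recall and F1-score."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Illustrative input: "Dia menaiki kereta api ke bandar" with assumed tags.
sentence = [
    ("dia", "PRON"), ("menaiki", "VB"), ("kereta", "NN"),
    ("api", "NN"), ("ke", "PREP"), ("bandar", "NN"),
]
predicted = set(extract_noun_bigrams(sentence))

# Hypothetical gold list of compound nouns for the evaluation step.
gold = {"kereta api", "air mata"}
precision, recall, f1 = precision_recall_f1(predicted, gold)
```

Running the same evaluation on a baseline rule set and on a modified rule set, against the same gold list, gives the kind of comparison the abstract describes.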

