Simultaneous Removal of Prefix and Suffix
Main Authors: | , |
---|---|
Format: | Article |
Language: | English |
Published: | World Scientific Publishing, 2020-05-01 |
Series: | Vietnam Journal of Computer Science |
Subjects: | |
Online Access: | http://www.worldscientific.com/doi/pdf/10.1142/S2196888820500074 |
Summary: | This work attempts to devise a stemmer that removes both the prefix and the suffix of a given English word at the same time. For a given input word, our method considers all possible internal N-grams as potential stems. We hypothesize that the stem is the candidate whose length is closest to half the length of the input word. A standard English dictionary is used to identify morphologically correct N-grams in the process. We apply our technique to a random sample of 100 English words, each possessing both a prefix and a suffix. We also compare our proposed stemmer with three standard algorithms from the literature. Empirical results show that our technique performs better than the other stemmers on this sample. |
ISSN: | 2196-8888 2196-8896 |
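
The idea outlined in the summary (enumerate internal N-grams, keep those found in a dictionary, prefer the one whose length is closest to half the word length) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `min_len` cutoff, the tie-breaking rule, the fallback when no candidate is found, and the toy dictionary are all assumptions made for illustration.

```python
def internal_ngrams(word, min_len=3):
    """Yield contiguous substrings that leave at least one character
    on each side (a non-empty prefix and a non-empty suffix).
    min_len is an assumed cutoff, not taken from the article."""
    n = len(word)
    for start in range(1, n - 1):
        for end in range(start + min_len, n):
            yield word[start:end]


def pick_stem(word, dictionary, min_len=3):
    """Return the dictionary-valid internal N-gram whose length is
    closest to half the length of the word."""
    candidates = {g for g in internal_ngrams(word, min_len) if g in dictionary}
    if not candidates:
        # Assumption: fall back to the unchanged word when nothing matches.
        return word
    target = len(word) / 2
    # Assumption: ties go to the longer candidate, then alphabetical order.
    return min(candidates, key=lambda s: (abs(len(s) - target), -len(s), s))


# Toy dictionary standing in for a standard English word list.
toy_dictionary = {"agree", "agreement", "disagree", "sag", "men", "break", "able"}

print(pick_stem("disagreement", toy_dictionary))
print(pick_stem("unbreakable", toy_dictionary))
```

In this sketch, "disagree" and "agreement" are never candidates for "disagreement" because each touches one end of the word, so only substrings with material removed on both sides compete. A real word list would admit many more short candidates, which is where the half-length preference does its work.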