Targeted s-gram matching: a novel n-gram matching technique for cross- and monolingual word form variants
We present a novel n-gram based string matching technique, which we call the targeted s-gram matching technique. In the technique, n-grams are classified into categories on the basis of character contiguity in words. The categories are then utilized in matching. The technique was compared with the c...
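The idea summarized above, grouping character pairs by how contiguous they are in the word and comparing them only within the same group, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from this record: the use of two-character grams, the default grouping `((0,), (1, 2))` (adjacent pairs in one category, pairs with one or two skipped characters in another), the Jaccard-style ratio, and the function names `skip_digrams` and `categorized_similarity`.

```python
def skip_digrams(word, skip):
    """Return the set of character pairs in `word` separated by exactly `skip` characters."""
    step = skip + 1
    return {word[i] + word[i + step] for i in range(len(word) - step)}


def categorized_similarity(a, b, categories=((0,), (1, 2))):
    """Compare two strings category by category (hypothetical scoring scheme).

    Each category is a tuple of skip lengths. Grams are pooled within a
    category and are only ever matched against grams from the same category
    of the other string. The score is shared grams over all distinct grams,
    summed across categories (an assumed, Jaccard-style combination).
    """
    shared = 0
    total = 0
    for category in categories:
        grams_a = set().union(*(skip_digrams(a, s) for s in category))
        grams_b = set().union(*(skip_digrams(b, s) for s in category))
        shared += len(grams_a & grams_b)
        total += len(grams_a | grams_b)
    return shared / total if total else 0.0


if __name__ == "__main__":
    # Spelling variants should score higher than unrelated words.
    print(categorized_similarity("colour", "color"))
    print(categorized_similarity("colour", "window"))
```

Keeping the categories separate means that a pair formed from adjacent characters in one word is never matched against a pair formed across a gap in the other word, which is one plausible reading of how the categories are "utilized in matching".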
Main Authors: | Ari Pirkola, Heikki Keskustalo, Erkka Leppänen, Antti-Pekka Känsälä, Kalervo Järvelin |
---|---|
Format: | Article |
Language: | English |
Published: | University of Borås, 2002-01-01 |
Series: | Information Research: An International Electronic Journal |
Online Access: | http://informationr.net/ir/7-2/paper126.html |
Similar Items
- The RATF Formula (Kwok's Formula): Exploiting Average Term Frequency in Cross-Language Retrieval
  by: Ari Pirkola, et al.
  Published: (2002-01-01)
- MiNgMatch—A Fast N-gram Model for Word Segmentation of the Ainu Language
  by: Karol Nowakowski, et al.
  Published: (2019-10-01)
- APPLYING A Q-GRAM BASED MULTIPLE STRING MATCHING ALGORITHM FOR APPROXIMATE MATCHING
  by: Robert Susik
  Published: (2017-09-01)
- Stemming and N-gram matching for term conflation in Turkish texts
  by: F. Çuna Ekmekçioglu, et al.
  Published: (1996-01-01)
- The subjective frequency of word n-grams
  by: Shaoul Cyrus, et al.
  Published: (2013-01-01)