Enhanced Levenshtein Edit Distance Method functioning as a String-to-String Similarity Measure

Levenshtein distance is a minimum edit distance method commonly used in spell-checking applications to generate candidate corrections. It computes the number of edit operations required to transform one string into another and recognizes three types of edit operation: deletion, insertion, and...

Bibliographic Details
Main Authors: Abbas Al-Bakry, Marwa Al-Rikaby
Format: Article
Language: Arabic
Published: University of Information Technology and Communications 2016-12-01
Series: Iraqi Journal for Computers and Informatics
Subjects: minimum edit distance, similarity, Levenshtein method, Damerau's error types
Online Access: http://ijci.uoitc.edu.iq/index.php/ijci/article/view/83
id doaj-f9271d943ca6436bad09f590b39e7d0b
record_format Article
spelling doaj-f9271d943ca6436bad09f590b39e7d0b; 2020-11-25T03:11:48Z; ara; University of Information Technology and Communications; Iraqi Journal for Computers and Informatics; ISSN 2313-190X, 2520-4912; 2016-12-01; vol. 42, no. 1, pp. 48-54; doi:10.25195/ijci.v42i1.83; Enhanced Levenshtein Edit Distance Method functioning as a String-to-String Similarity Measure; Abbas Al-Bakry (University of Information Technology and Communications (UOITC)); Marwa Al-Rikaby (Babylon University); http://ijci.uoitc.edu.iq/index.php/ijci/article/view/83; minimum edit distance, similarity, Levenshtein method, Damerau's error types
collection DOAJ
language Arabic
format Article
sources DOAJ
author Abbas Al-Bakry
Marwa Al-Rikaby
title Enhanced Levenshtein Edit Distance Method functioning as a String-to-String Similarity Measure
publisher University of Information Technology and Communications
series Iraqi Journal for Computers and Informatics
issn 2313-190X
2520-4912
publishDate 2016-12-01
description Levenshtein distance is a minimum edit distance method commonly used in spell-checking applications to generate candidate corrections. It computes the number of edit operations required to transform one string into another and recognizes three types of edit operation: deletion, insertion, and substitution of a single letter. Damerau modified the Levenshtein method to handle a fourth type, the transposition of two adjacent letters. However, this modification adds to the quadratic time complexity of the original method. In this paper, we propose a modification of the original Levenshtein method that handles the same four types using a very small number of matching operations, which results in a shorter execution time. We also derive a similarity measure that uses the distance produced by any edit distance method to quantify the similarity between two given strings.
topic minimum edit distance, similarity, Levenshtein method, Damerau's error types
url http://ijci.uoitc.edu.iq/index.php/ijci/article/view/83
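
The edit operations named in the description can be illustrated with a minimal Python sketch. It shows the textbook Levenshtein distance (deletion, insertion, substitution) and the common "optimal string alignment" variant that also counts the transposition of two adjacent letters. It is not the enhanced method proposed in the article, whose details this record does not give. The function names are hypothetical, and the normalization in `similarity` (one minus the distance divided by the longer string's length) is an assumed, commonly used formula rather than the authors' measure.

```python
from typing import Callable


def levenshtein(a: str, b: str) -> int:
    """Textbook Levenshtein distance: deletion, insertion, substitution."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into "" takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def osa_distance(a: str, b: str) -> int:
    """Optimal string alignment: Levenshtein plus adjacent transposition."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution or match
            # Two adjacent letters swapped count as a single edit.
            if (i > 1 and j > 1
                    and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


def similarity(a: str, b: str,
               distance: Callable[[str, str], int] = osa_distance) -> float:
    """Assumed normalization: 1 - distance / length of the longer string."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty strings are treated as identical
    return 1.0 - distance(a, b) / longest
```

Under these assumptions, swapping two adjacent letters costs two edits with `levenshtein` but one with `osa_distance`, and `similarity` accepts either function through its `distance` parameter, which mirrors the description's point that the measure can use the distance from any edit distance method.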