Methodology for fuzzy duplicate record identification based on the semantic-syntactic information of similarity
Several methodologies exist for identifying fuzzy duplicate records during data cleaning for data warehouses and data mining. These methodologies can be classified into three groups: blocking methods, windowing methods, and semantic methods. Th...
Main Authors: | , |
---|---|
Format: | Article |
Language: | English |
Published: | Elsevier 2020-01-01 |
Series: | Journal of King Saud University: Computer and Information Sciences |
Online Access: | http://www.sciencedirect.com/science/article/pii/S1319157817304512 |