Duplicate Detection with PMC -- A Parallel Approach to Pattern Matching
Fuzzy duplicate detection is an integral part of data cleansing. It consists of finding a set of duplicate records, correctly identifying the original or most representative record, and removing the rest. The rate of Internet usage and the availability and collectability of data are increasing, so we get mor...
Main Author: | |
---|---|
Format: | Others |
Language: | English |
Published: | Norges teknisk-naturvitenskapelige universitet, Institutt for datateknikk og informasjonsvitenskap, 2007 |
Subjects: | |
Online Access: | http://urn.kb.se/resolve?urn=urn:nbn:no:ntnu:diva-9642 |