Learning-Based Fusion for Data Deduplication: A Robust and Automated Solution

This thesis presents two deduplication techniques that overcome the following critical and long-standing weaknesses of rule-based deduplication: (1) traditional rule-based deduplication requires significant manual tuning of the individual rules, including the selection of appropriate thresholds; (2)...

Bibliographic Details
Main Author: Dinerstein, Jared
Format: Others
Published: DigitalCommons@USU 2010
Subjects:
svm
Online Access: https://digitalcommons.usu.edu/etd/787
https://digitalcommons.usu.edu/cgi/viewcontent.cgi?article=1783&context=etd