Data De-Duplication through Active Learning

Data de-duplication concerns the identification and eventual elimination of records in a dataset that refer to the same entity without necessarily sharing the same attribute values or the same identifying values. Machine learning techniques have been used to handle data de-duplication....

Bibliographic Details
Main Author:	Muhivuwomunda, Divine
Format:	Others
Language:	en
Published:	University of Ottawa (Canada) 2013
Subjects:	Computer Science.
Online Access:	http://hdl.handle.net/10393/28859 http://dx.doi.org/10.20381/ruor-19478

Similar Items

Space and time scalability of duplicate detection in graph data
by: Herschel, Melanie, et al.
Published: (2008)

Active duplicate detection with Bayesian nonparametric models
by: Matsakis, Nicholas E. (Nicholas Elias), 1976-
Published: (2010)

Cloud De-Duplication Cost Model
by: Hocker, Christopher
Published: (2012)

Learning from multirelational data through multiple views
by: Guo, Hongyu
Published: (2013)

Visualizing and Understanding Code Duplication in Large Software Systems
by: Jiang, Zhen Ming
Published: (2006)

Efficient near duplicate document detection for specialized corpora
by: Seshasai, Shreyes
Published: (2010)

A secure steganographic file system with non-duplicating properties
by: Ellefsen, Ian David
Published: (2012)

The Impact of Near-Duplicate Documents on Information Retrieval Evaluation
by: Khoshdel Nikkhoo, Hani
Published: (2011)

Adaptive windows for duplicate detection
by: Draisbach, Uwe, et al.
Published: (2012)

A regularization framework for active learning from imbalanced data
by: Paskov, Hristo Spassimirov
Published: (2011)

Classifying Everyday Activity Through Label Propagation With Sparse Training Data
Published: (2013)

Modeling, Designing, and Implementing an Ad-hoc M-Learning Platform that Integrates Sensory Data to Support Ubiquitous Learning
by: Nguyen, Hien M.
Published: (2015)

Duplication et cohérence configurables dans les applications réparties à base de composants
by: Marangozova, Vania
Published: (2003)

The spatial learning method : facilitation of learning through the use of cognitive mapping in virtual reality
by: Johns, Cathryn
Published: (2014)

Interpreting human activity from electrical consumption data through non-intrusive load monitoring
by: Gillman, Mark Daniel
Published: (2014)

Language Learning Through Comparison
by: Babarsad, Omid Bakhshandeh
Published: (2017)

Learning through looking and listening
by: Recasens Continente, Adriá.
Published: (2020)

Enhancing Availability of Crucial Cloud Data through Automatic Duplication
by: TU, YI-MING, et al.
Published: (2018)

Probabilistic and Deep Learning Algorithms for the Analysis of Imagery Data
by: Basu, Saikat
Published: (2016)

Efficient Incremental Model Learning on Data Streams
Published: (2019)

Authentic Learning in Engineering Technology Through the use of a Technology and Learning Matrix Based Curriculum
by: Green, Darrell W.
Published: (1995)

Scalable knowledge acquisition through cumulative learning and memory organization
by: Stracuzzi, David J
Published: (2006)

Learning spoken language through vision
by: Harwath, David F. (David Frank)
Published: (2018)

On practical machine learning and data analysis
by: Gillblad, Daniel
Published: (2008)

Automated Learning Of Health Behaviors Through Consumer Authored Natural Language Text
by: Yin, Zhijun
Published: (2018)

Online learning for imbalanced data: optimizing asymmetric measures
by: Zhang, Xiaoxuan
Published: (2018)

Statistical models and analysis techniques for learning in relational data
by: Neville, Jennifer
Published: (2006)

Integration of screencast video through QR code: an effective learning material for m-learning
by: Yahya, Faridah Hanim, et al.
Published: (2018)

Fibroepithelial Polyp in a Duplicated Ureter
by: Jae Young Lee, et al.
Published: (2020-09-01)

Deep learning and structured data
by: Zhang, Chiyuan, Ph. D. Massachusetts Institute of Technology
Published: (2018)

Generating synthetic data through Hidden Markov Models
by: Ferrando Huertas, Jaime
Published: (2018)

Secure biometric authentication with de-duplication on distributed cloud storage
by: Vinoth Kumar M, et al.
Published: (2021-07-01)

Active learning in partially observable Markov decision processes
by: Jaulmes, Robin.
Published: (2006)

Machine Learning Algorithms and Applications for Lidar, Images, and Unstructured Data

Word sense disambiguation through lattice learning
by: Stickgold, Eli (Eli B.)
Published: (2011)

Utilization Review of Duplicate Prescription Indicators Through the Cloud Medication Data
by: Chia-Ling Tang, et al.

Characterization of Data Locality Potential of CPU and GPU Applications through Dynamic Analysis
by: Fauzia, Naznin
Published: (2015)

Implementation of an online collaborative learning through grid portal technology
by: Zakaria, Nur Liyana
Published: (2010)