Improve Data Quality By Using Dependencies And Regular Expressions

Bibliographic Details
Main Author: Feng, Yuan
Format: Others
Language: English
Published: Mittuniversitetet, Avdelningen för informationssystem och -teknologi 2018
Subjects:
Online Access: http://urn.kb.se/resolve?urn=urn:nbn:se:miun:diva-35620
Description
Summary: The objective of this study is to find ways to improve the quality of databases. Data stored in a database often has many problems, such as missing values or spelling errors. To deal with such dirty data, this study uses conditional functional dependencies and regular expressions to detect and correct errors. Building on earlier studies of data cleaning methods, it considers more complex database conditions and combines efficient algorithms to process the data. The study shows that these methods can improve a database's quality, but that, given their time and space complexity, much work remains to make the data cleaning process more efficient.
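
As an illustration only, the two detection ideas named in the summary might be sketched as follows. This is not the thesis's algorithm: the sample rows, the column names, the five-digit postal code pattern, and the helper functions `regex_violations` and `cfd_violations` are all assumptions made for this example. It checks one conditional functional dependency (for rows where country is "SE", zip determines city) and one regular expression on the zip column.

```python
import re

# Hypothetical sample rows; the column names and values are illustrative only.
rows = [
    {"country": "SE", "zip": "85170", "city": "Sundsvall"},
    {"country": "SE", "zip": "85170", "city": "Sundsval"},   # spelling error
    {"country": "SE", "zip": "8517O", "city": "Sundsvall"},  # letter O instead of zero
    {"country": "SE", "zip": "11120", "city": ""},           # missing value
]

# Assumed format rule: a postal code is exactly five digits.
ZIP_PATTERN = re.compile(r"\d{5}")


def regex_violations(rows):
    """Return indices of rows whose zip does not match the assumed pattern."""
    return [i for i, r in enumerate(rows) if not ZIP_PATTERN.fullmatch(r["zip"])]


def cfd_violations(rows, condition, lhs, rhs):
    """Check a conditional functional dependency.

    Among rows matching `condition`, equal values on the `lhs` attributes
    should imply equal values on `rhs`. Returns the lhs keys that map to
    more than one rhs value, with the row indices for each value.
    """
    seen = {}
    for i, r in enumerate(rows):
        if all(r[k] == v for k, v in condition.items()):
            key = tuple(r[a] for a in lhs)
            seen.setdefault(key, {}).setdefault(r[rhs], []).append(i)
    return {key: groups for key, groups in seen.items() if len(groups) > 1}


bad_zip = regex_violations(rows)
missing_city = [i for i, r in enumerate(rows) if not r["city"]]
conflicts = cfd_violations(rows, {"country": "SE"}, ["zip"], "city")

print(bad_zip)       # rows with a malformed zip
print(missing_city)  # rows with an empty city
print(conflicts)     # zip values that map to more than one city
```

A correction step could then, for example, replace the minority city value in each conflicting group with the majority one, though how repairs are chosen is a design decision this sketch leaves open.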