The Challenges of Data Quality and Data Quality Assessment in the Big Data Era

Bibliographic Details
Main Authors: Li Cai, Yangyong Zhu
Format: Article
Language: English
Published: Ubiquity Press 2015-05-01
Series: Data Science Journal
Online Access: http://datascience.codata.org/articles/553
Record ID: doaj-15d909ea1f1645ecb132f49ea2c29350
Source: DOAJ
ISSN: 1683-1470
Volume: 14
DOI: 10.5334/dsj-2015-002
Author Affiliations:
Li Cai: School of Computer and Science, Fudan University, No. 220, Han Dan Road, Shanghai; School of Software, Yunnan University, No. 2 North Road of Cui Hu, Kunming
Yangyong Zhu: Shanghai Key Laboratory of Data Science, Fudan University, Shanghai
Description: High-quality data are the precondition for analyzing and using big data and for guaranteeing the value of the data. Currently, comprehensive analysis and research of quality standards and quality assessment methods for big data are lacking. First, this paper summarizes reviews of data quality research. Second, this paper analyzes the data characteristics of the big data environment, presents quality challenges faced by big data, and formulates a hierarchical data quality framework from the perspective of data users. This framework consists of big data quality dimensions, quality characteristics, and quality indexes. Finally, on the basis of this framework, this paper constructs a dynamic assessment process for data quality. This process has good expansibility and adaptability and can meet the needs of big data quality assessment. The research results enrich the theoretical scope of big data and lay a solid foundation for the future by establishing an assessment model and studying evaluation algorithms.