Reducing defects in the datasets of clinical research studies: conformance with data quality metrics

Abstract Background A dataset is indispensable to answer the research questions of clinical research studies. Inaccurate data lead to ambiguous results, and the removal of errors results in increased cost. The aim of this Quality Improvement Project (QIP) was to improve the Data Quality (DQ) by enha...

Bibliographic Details
Main Authors: Naila A. Shaheen, Bipin Manezhi, Abin Thomas, Mohammed AlKelya
Format: Article
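Language: English
Published: BMC 2019-05-01
Series: BMC Medical Research Methodology
Subjects: Defective dataset; Data entry errors; Clinical research data quality; Data quality metrics; Poor-quality dataset; Data quality management
Online Access: http://link.springer.com/article/10.1186/s12874-019-0735-7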
id doaj-6881eaa183004b5eb542587160357f58
record_format Article
spelling doaj-6881eaa183004b5eb542587160357f58; 2020-11-25T02:58:13Z; eng; BMC; BMC Medical Research Methodology; 1471-2288; 2019-05-01; volume 19, issue 1, pages 1-8; 10.1186/s12874-019-0735-7
Naila A. Shaheen (Department of Biostatistics and Bioinformatics, King Abdullah International Medical Research Center)
Bipin Manezhi (Public Health Division, Central Australian Aboriginal Congress)
Abin Thomas (Department of Biostatistics and Bioinformatics, King Abdullah International Medical Research Center)
Mohammed AlKelya (Research Quality Management Section, King Abdullah International Medical Research Center)
collection DOAJ
language English
format Article
sources DOAJ
author Naila A. Shaheen
Bipin Manezhi
Abin Thomas
Mohammed AlKelya
title Reducing defects in the datasets of clinical research studies: conformance with data quality metrics
publisher BMC
series BMC Medical Research Methodology
issn 1471-2288
publishDate 2019-05-01
description Abstract
Background: A dataset is indispensable for answering the research questions of clinical research studies. Inaccurate data lead to ambiguous results, and removing errors increases cost. The aim of this Quality Improvement Project (QIP) was to improve Data Quality (DQ) by enhancing conformance and minimizing data entry errors.
Methods: This QIP was conducted in the Department of Biostatistics using historical datasets submitted for statistical data analysis, drawn from the department's knowledge base system. Forty-five datasets received for statistical data analysis were included at baseline. A 12-item checklist based on six DQ domains, (i) completeness, (ii) uniqueness, (iii) timeliness, (iv) accuracy, (v) validity, and (vi) consistency, was developed to assess DQ. The 12 items were: missing values, un-coded values, miscoded values, embedded values, implausible values, unformatted values, missing codebook, inconsistencies with the codebook, inaccurate format, unanalyzable data structure, missing outcome variables, and missing analytic variables. The outcome was the number of defects per dataset. The DMAIC (Define, Measure, Analyze, Improve, Control) quality improvement framework and sigma improvement tools were used. A pre-post design was used to evaluate the interventions. The pre-post change in defects (zero, one, two or more defects) was compared using the chi-square test.
Results: At baseline, of the forty-five datasets, six (13.3%) had zero defects, eight (17.8%) had one defect, and 31 (69%) had two or more defects. The association between the nature of data capture (single vs. multiple data points) and defective data was statistically significant (p = 0.008). Twenty-one datasets were received for statistical data analysis during the post-intervention period. Seventeen (81%) had zero defects, two (9.5%) had one defect, and two (9.5%) had two or more defects. The proportion of datasets with zero defects increased from 13.3% to 81%, whereas the proportion of datasets with two or more defects decreased from 69% to 9.5% (p < 0.001).
Conclusion: Clinical research study teams often have limited knowledge of data structuring. Given the need for good-quality data, we recommend training programs, consultation with data experts prior to data structuring, and the use of electronic data capture methods.
topic Defective dataset
Data entry errors
Clinical research data quality
Data quality metrics
Poor-quality dataset
Data quality management
url http://link.springer.com/article/10.1186/s12874-019-0735-7
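
The pre-post comparison reported in the abstract can be sketched from the published counts alone. The example below is an illustration, not the authors' analysis: it assumes an uncorrected Pearson chi-square test on the 2 x 3 table of defect categories, and the function name `pearson_chi_square` is hypothetical. Some expected counts in this table fall below 5, so the chi-square approximation is rough here and the result should be read only as a consistency check against the reported p < 0.001.

```python
import math

# Counts taken from the abstract: datasets with zero, one, and two or more defects.
baseline = [6, 8, 31]  # 45 datasets before the interventions
post = [17, 2, 2]      # 21 datasets after the interventions


def pearson_chi_square(table):
    """Return the uncorrected Pearson chi-square statistic and degrees of freedom."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of period and defect category.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


stat, dof = pearson_chi_square([baseline, post])

# With exactly 2 degrees of freedom, the chi-square survival function
# has the closed form exp(-x / 2), so no statistics library is needed.
p_value = math.exp(-stat / 2)

# Share of datasets with zero defects in each period, in percent.
zero_defect_pct = [100 * baseline[0] / sum(baseline), 100 * post[0] / sum(post)]

print(f"chi-square = {stat:.2f}, dof = {dof}, p = {p_value:.2g}")
print(f"zero-defect share: {zero_defect_pct[0]:.1f}% -> {zero_defect_pct[1]:.1f}%")
```

The closed-form p-value applies only because this table has 2 degrees of freedom; a table of another shape would need a general chi-square distribution function.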