Demonstrating the robustness of population surveillance data: implications of error rates on demographic and mortality estimates

Abstract. Background: As in any measurement process, a certain amount of error may be expected in routine population surveillance operations such as those in demographic surveillance sites (DSSs). Vital events are likely to be missed and errors made no ma...

Bibliographic Details
Main Authors: Berhane Yemane, Byass Peter, Fottrell Edward
Format: Article
Language: English
Published: BMC 2008-03-01
Series: BMC Medical Research Methodology
Online Access: http://www.biomedcentral.com/1471-2288/8/13
id doaj-bf701122f4a044c0b37531e4ec5be262
record_format Article
spelling doaj-bf701122f4a044c0b37531e4ec5be262 2020-11-25T00:57:19Z eng BMC BMC Medical Research Methodology 1471-2288 2008-03-01 8 1 13 10.1186/1471-2288-8-13 Demonstrating the robustness of population surveillance data: implications of error rates on demographic and mortality estimates Berhane Yemane Byass Peter Fottrell Edward http://www.biomedcentral.com/1471-2288/8/13
collection DOAJ
language English
format Article
sources DOAJ
author Berhane Yemane
Byass Peter
Fottrell Edward
title Demonstrating the robustness of population surveillance data: implications of error rates on demographic and mortality estimates
publisher BMC
series BMC Medical Research Methodology
issn 1471-2288
publishDate 2008-03-01
description <p>Abstract</p>
<p>Background</p>
<p>As in any measurement process, a certain amount of error may be expected in routine population surveillance operations such as those in demographic surveillance sites (DSSs). Vital events are likely to be missed and errors made no matter what method of data capture is used or what quality control procedures are in place. The extent to which random errors in large, longitudinal datasets affect overall health and demographic profiles has important implications for the role of DSSs as platforms for public health research and clinical trials. Such knowledge is also of particular importance if the outputs of DSSs are to be extrapolated and aggregated with realistic margins of error and validity.</p>
<p>Methods</p>
<p>This study uses the first 10-year dataset from the Butajira Rural Health Project (BRHP) DSS, Ethiopia, covering approximately 336,000 person-years of data. Simple programmes were written to introduce random errors and omissions into new versions of the definitive 10-year Butajira dataset. Key parameters of sex, age, death, literacy and roof material (an indicator of poverty) were selected for the introduction of errors based on their obvious importance in demographic and health surveillance and their established significant associations with mortality.</p>
<p>Defining the original 10-year dataset as the 'gold standard' for the purposes of this investigation, population, age and sex compositions and Poisson regression models of mortality rate ratios were compared between each of the intentionally erroneous datasets and the original 'gold standard' 10-year data.</p>
<p>Results</p>
<p>The composition of the Butajira population was well represented despite introducing random errors, and differences between population pyramids based on the derived datasets were subtle. Regression analyses of well-established mortality risk factors were largely unaffected even by relatively high levels of random errors in the data.</p>
<p>Conclusion</p>
<p>The low sensitivity of parameter estimates and regression analyses to significant amounts of randomly introduced errors indicates a high level of robustness of the dataset. This apparent inertia of population parameter estimates to simulated errors is largely due to the size of the dataset. Tolerable margins of random error in DSS data may exceed 20%. While this is not an argument in favour of poor quality data, reducing the time and valuable resources spent on detecting and correcting random errors in routine DSS operations may be justifiable as the returns from such procedures diminish with increasing overall accuracy. The money and effort currently spent on endlessly correcting DSS datasets would perhaps be better spent on increasing the surveillance population size and geographic spread of DSSs and analysing and disseminating research findings.</p>
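The Methods paragraph describes writing simple programmes that introduce random errors into copies of a gold-standard dataset and then comparing mortality rate ratios. The record does not show those programmes, so the following is only a minimal Python sketch of the general idea. Everything in it is an assumption made for illustration: the `Person` record layout, the helper names `introduce_errors`, `rate_ratio` and `synthetic_population`, the made-up mortality risks, and the use of a crude rate ratio in place of the Poisson regression models used in the study. It uses synthetic data, not the Butajira dataset.

```python
import random
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Person:
    # Hypothetical record layout, not the BRHP schema.
    sex: str             # "M" or "F"
    literate: bool
    person_years: float  # observation time contributed
    died: bool


def introduce_errors(people, field, values, rate, rng):
    """Return a copy in which `field` is set to a wrong value for roughly `rate` of records."""
    out = []
    for p in people:
        if rng.random() < rate:
            wrong = [v for v in values if v != getattr(p, field)]
            p = replace(p, **{field: rng.choice(wrong)})
        out.append(p)
    return out


def rate_ratio(people, field, exposed, reference):
    """Crude mortality rate ratio: deaths per person-year, exposed vs reference group."""
    def rate(value):
        group = [p for p in people if getattr(p, field) == value]
        return sum(p.died for p in group) / sum(p.person_years for p in group)
    return rate(exposed) / rate(reference)


def synthetic_population(n, rng):
    """Made-up records in which illiterate people have twice the risk of death."""
    people = []
    for _ in range(n):
        literate = rng.random() < 0.3
        risk = 0.05 if literate else 0.10  # assumed values, for illustration only
        people.append(Person(rng.choice("MF"), literate, 1.0, rng.random() < risk))
    return people


if __name__ == "__main__":
    rng = random.Random(0)
    gold = synthetic_population(50_000, rng)
    for rate in (0.0, 0.05, 0.10, 0.20):
        noisy = introduce_errors(gold, "literate", [True, False], rate, rng)
        print(rate, round(rate_ratio(noisy, "literate", False, True), 2))
```

Running the script prints one rate ratio per error rate, so the erroneous copies can be compared against the error-free one. The numbers depend entirely on the synthetic assumptions above and say nothing about the Butajira results.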
url http://www.biomedcentral.com/1471-2288/8/13