Summary: | Abstract. Background: Big data research is important for studying uncommon diseases in real-world settings. Most big data studies in axial spondyloarthritis (axSpA) have been limited to populations identified with billing codes for ankylosing spondylitis (AS). axSpA is a more inclusive concept, and reliance on AS codes does not produce a comprehensive axSpA study population. The first objective was to describe our process for establishing an appropriate sample of patients with and without axSpA for developing accurate axSpA identification methods. The second objective was to determine the classification performance of AS billing codes against the chart-reviewed axSpA reference standard. Methods: Veteran Health Affairs clinical and administrative data, between January 2005 and June 2015, were used to randomly select patients with clinical phenotypes that represented high, moderate, and low likelihoods of an axSpA diagnosis. With chart review, the sampled patients were classified as Yes axSpA, No axSpA, or Uncertain axSpA, and these classification assignments were used as the reference standard for determining the positive predictive value (PPV) and sensitivity of AS ICD-9 codes for axSpA. Results: Six hundred patients were classified as Yes axSpA (26.8%), No axSpA (68.3%), or Uncertain axSpA (4.8%). The PPV and sensitivity of an AS ICD-9 code for axSpA were 83.3% and 57.3%, respectively. Conclusions: Standard methods of identifying axSpA patients in a large dataset lacked sensitivity. An appropriate sample of patients with and without axSpA was established and characterized for developing novel axSpA identification methods that are anticipated to enable previously impractical big data research.
|
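The two performance measures reported above, PPV and sensitivity, can be computed from a simple cross-tabulation of the billing code against the chart-review reference standard. The following is a minimal sketch, not the authors' analysis code: the function name `ppv_and_sensitivity` and the counts are hypothetical and chosen only for illustration, and it assumes a binary reference standard (the abstract does not state how Uncertain axSpA patients were handled in the calculation).

```python
from typing import Tuple


def ppv_and_sensitivity(tp: int, fp: int, fn: int) -> Tuple[float, float]:
    """Return (PPV, sensitivity) for a code-based case definition.

    tp: patients with the code who have the condition on chart review
    fp: patients with the code who do not have the condition on chart review
    fn: patients without the code who have the condition on chart review
    """
    if tp + fp == 0:
        raise ValueError("PPV is undefined when no patient has the code")
    if tp + fn == 0:
        raise ValueError("Sensitivity is undefined when no patient has the condition")
    ppv = tp / (tp + fp)          # of those flagged by the code, fraction truly positive
    sensitivity = tp / (tp + fn)  # of those truly positive, fraction flagged by the code
    return ppv, sensitivity


# Hypothetical counts for illustration only; these are NOT the study's data.
ppv, sens = ppv_and_sensitivity(tp=40, fp=10, fn=40)
print(f"PPV = {ppv:.1%}, sensitivity = {sens:.1%}")
# prints "PPV = 80.0%, sensitivity = 50.0%"
```

A pattern like the one in the hypothetical counts (high PPV, low sensitivity) is what the conclusion describes: most patients carrying the code have the condition, but many patients with the condition never receive the code.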