Characterisation, identification, clustering, and classification of disease

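Abstract The importance of quantifying the distribution and determinants of multimorbidity has prompted novel data-driven classifications of disease. Applications have included improved statistical power and refined prognoses for a range of respiratory, infectious, autoimmune, and neurological diseases, with studies using molecular information, age of disease incidence, and sequences of disease onset (“disease trajectories”) to classify disease clusters. Here we consider whether easily measured risk factors such as height and BMI can effectively characterise diseases in UK Biobank data, combining established statistical methods in new but rigorous ways to provide clinically relevant comparisons and clusters of disease. Over 400 common diseases were selected for analysis using clinical and epidemiological criteria, and conventional proportional hazards models were used to estimate associations with 12 established risk factors. Several diseases had strongly sex-dependent associations of disease risk with BMI. Importantly, a large proportion of diseases affecting both sexes could be identified by their risk factors, and equivalent diseases tended to cluster adjacently. These included 10 diseases presently classified as “Symptoms, signs, and abnormal clinical and laboratory findings, not elsewhere classified”. Many clusters are associated with a shared, known pathogenesis; others suggest likely but presently unconfirmed causes. The specificity of associations and shared pathogenesis of many clustered diseases provide a new perspective on the interactions between biological pathways, risk factors, and patterns of disease such as multimorbidity.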

Bibliographic Details
Main Authors: A. J. Webster, K. Gaitskell, I. Turnbull, B. J. Cairns, R. Clarke
Format: Article
Language: English
Published: Nature Publishing Group 2021-03-01
Series: Scientific Reports
Online Access: https://doi.org/10.1038/s41598-021-84860-z
id doaj-23dddf77bfbd46098aa99463ae48cafd
record_format Article
spelling doaj-23dddf77bfbd46098aa99463ae48cafd; 2021-03-11T12:13:49Z; eng; Nature Publishing Group; Scientific Reports; 2045-2322; 2021-03-01; 11; 1; 1; 13; 10.1038/s41598-021-84860-z; Characterisation, identification, clustering, and classification of disease; A. J. Webster, K. Gaitskell, I. Turnbull, B. J. Cairns, R. Clarke (all: Nuffield Department of Population Health, University of Oxford); https://doi.org/10.1038/s41598-021-84860-z
collection DOAJ
language English
format Article
sources DOAJ
author A. J. Webster
K. Gaitskell
I. Turnbull
B. J. Cairns
R. Clarke
title Characterisation, identification, clustering, and classification of disease
publisher Nature Publishing Group
series Scientific Reports
issn 2045-2322
publishDate 2021-03-01
description Abstract The importance of quantifying the distribution and determinants of multimorbidity has prompted novel data-driven classifications of disease. Applications have included improved statistical power and refined prognoses for a range of respiratory, infectious, autoimmune, and neurological diseases, with studies using molecular information, age of disease incidence, and sequences of disease onset (“disease trajectories”) to classify disease clusters. Here we consider whether easily measured risk factors such as height and BMI can effectively characterise diseases in UK Biobank data, combining established statistical methods in new but rigorous ways to provide clinically relevant comparisons and clusters of disease. Over 400 common diseases were selected for analysis using clinical and epidemiological criteria, and conventional proportional hazards models were used to estimate associations with 12 established risk factors. Several diseases had strongly sex-dependent associations of disease risk with BMI. Importantly, a large proportion of diseases affecting both sexes could be identified by their risk factors, and equivalent diseases tended to cluster adjacently. These included 10 diseases presently classified as “Symptoms, signs, and abnormal clinical and laboratory findings, not elsewhere classified”. Many clusters are associated with a shared, known pathogenesis; others suggest likely but presently unconfirmed causes. The specificity of associations and shared pathogenesis of many clustered diseases provide a new perspective on the interactions between biological pathways, risk factors, and patterns of disease such as multimorbidity.
url https://doi.org/10.1038/s41598-021-84860-z