Deep Learning Prediction of Cancer Prevalence from Satellite Imagery

The worldwide growth of cancer incidence can be explained in part by changes in the prevalence and distribution of risk factors. There are geographical gaps in the estimates of cancer prevalence, which could be filled with innovative methods. We used deep learning (DL) features extracted from satell...

Bibliographic Details
Main Authors: Jean-Emmanuel Bibault, Maxime Bassenne, Hongyi Ren, Lei Xing
Format: Article
Language: English
Published: MDPI AG 2020-12-01
Series: Cancers
Subjects: cancer, epidemiology, deep learning
Online Access: https://www.mdpi.com/2072-6694/12/12/3844
id doaj-a134076bd7004395a1e2a3d77d22658e
record_format Article
spelling doaj-a134076bd7004395a1e2a3d77d22658e
2020-12-20T00:03:01Z
eng
MDPI AG
Cancers
2072-6694
2020-12-01
12
3844
3844
10.3390/cancers12123844
Deep Learning Prediction of Cancer Prevalence from Satellite Imagery
Jean-Emmanuel Bibault, Maxime Bassenne, Hongyi Ren, Lei Xing
Laboratory of Artificial Intelligence in Medicine and Biomedical Physics, Stanford University School of Medicine, Stanford, CA 94304, USA (affiliation of all four authors)
collection DOAJ
language English
format Article
sources DOAJ
author Jean-Emmanuel Bibault
Maxime Bassenne
Hongyi Ren
Lei Xing
title Deep Learning Prediction of Cancer Prevalence from Satellite Imagery
publisher MDPI AG
series Cancers
issn 2072-6694
publishDate 2020-12-01
description The worldwide growth of cancer incidence can be explained in part by changes in the prevalence and distribution of risk factors. Estimates of cancer prevalence have geographical gaps, which innovative methods could fill. We used deep learning (DL) features extracted from satellite images to predict cancer prevalence at the census tract level in seven cities in the United States. We trained the model using detailed 2018 cancer prevalence estimates from the CDC (Centers for Disease Control and Prevention) 500 Cities project. Data from 3500 census tracts covering 14,483,366 inhabitants were included. Features were extracted from 170,210 satellite images with deep learning. This method explained up to 64.37% (median = 43.53%) of the variation in cancer prevalence. Satellite features are highly correlated with individual socioeconomic and health measures that are linked to cancer prevalence (age, smoking and drinking status, and obesity). A higher similarity between two environments is associated with better generalization of the model (p = 1 × 10^-6). This method can be used to accurately estimate cancer prevalence at a high spatial resolution, without surveys and at a fraction of the cost.
topic cancer
epidemiology
deep learning
url https://www.mdpi.com/2072-6694/12/12/3844