Training a computer-aided polyp detection system to detect sessile serrated adenomas using public domain colonoscopy videos

Bibliographic Details
Main Authors: Taibo Li, Jeremy R. Glissen Brown, Kelovoulos Tsourides, Nadim Mahmud, Jonah M. Cohen, Tyler M. Berzin
Format: Article
Language: English
Published: Georg Thieme Verlag KG 2020-10-01
Series: Endoscopy International Open
Online Access: http://www.thieme-connect.de/DOI/DOI?10.1055/a-1229-3927
id doaj-3e089d64ff3245cfa66d38098e7921f2
record_format Article
spelling doaj-3e089d64ff3245cfa66d38098e7921f2
2020-11-25T04:00:32Z
eng
Georg Thieme Verlag KG
Endoscopy International Open
2364-3722
2196-9736
2020-10-01
08 (10): E1448-E1454
10.1055/a-1229-3927
Training a computer-aided polyp detection system to detect sessile serrated adenomas using public domain colonoscopy videos
Taibo Li: Johns Hopkins School of Medicine – MD-PhD Program, Baltimore, Maryland, United States
Jeremy R. Glissen Brown: Center for Advanced Endoscopy, Division of Gastroenterology, Beth Israel Deaconess Medical Center and Harvard Medical School, Boston, Massachusetts 02130
Kelovoulos Tsourides: MIT – Department of Brain and Cognitive Sciences, Cambridge, Massachusetts, United States
Nadim Mahmud: Hospital of the University of Pennsylvania – Division of Gastroenterology, Boston, Massachusetts, United States
Jonah M. Cohen: Center for Advanced Endoscopy, Division of Gastroenterology, Beth Israel Deaconess Medical Center and Harvard Medical School, Boston, Massachusetts 02130
Tyler M. Berzin: Center for Advanced Endoscopy, Division of Gastroenterology, Beth Israel Deaconess Medical Center and Harvard Medical School, Boston, Massachusetts 02130
http://www.thieme-connect.de/DOI/DOI?10.1055/a-1229-3927
collection DOAJ
language English
format Article
sources DOAJ
author Taibo Li
Jeremy R. Glissen Brown
Kelovoulos Tsourides
Nadim Mahmud
Jonah M. Cohen
Tyler M. Berzin
author_sort Taibo Li
title Training a computer-aided polyp detection system to detect sessile serrated adenomas using public domain colonoscopy videos
publisher Georg Thieme Verlag KG
series Endoscopy International Open
issn 2364-3722
2196-9736
publishDate 2020-10-01
description Background: Colorectal cancer (CRC) is a major public health burden worldwide, and colonoscopy is the most commonly used CRC screening tool. Still, adenoma detection rate (ADR) varies among endoscopists. Recent studies have reported improved ADR using deep learning models trained on videos curated largely from private in-house datasets. Few have focused on the detection of sessile serrated adenomas (SSAs), which are the most challenging target clinically. Methods: We identified 23 colonoscopy videos in the public domain for which pathology data were provided, totaling 390 minutes of footage. Expert endoscopists annotated video segments containing adenomatous polyps, from which we captured 509 polyp-positive and 6,875 polyp-free frames. Via data augmentation, we generated 15,270 adenomatous polyp-positive images, of which 2,310 were SSAs, and 20,625 polyp-negative images. We fine-tuned the parameters of the convolutional neural network (CNN) AlexNet on 90% of the images, then tested its performance on the remaining 10%, which the model had not seen. Results: We trained the model on 32,305 images and tested it on 3,590 images with the same proportions of SSA, non-SSA polyp-positive, and polyp-negative images. The overall accuracy of the model was 0.86, with a sensitivity of 0.73 and a specificity of 0.96. Positive predictive value was 0.93 and negative predictive value was 0.96. The area under the curve was 0.94. SSAs were detected in 93% of SSA-positive images. Conclusions: Using a relatively small set of publicly available colonoscopy data, we obtained sizable training and validation sets of endoscopic images through data augmentation and achieved excellent performance in adenomatous polyp detection.
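The dataset figures quoted in the description can be cross-checked with a little arithmetic. The sketch below is only an illustration and is not taken from the article: the function names are hypothetical, rounding the 90% training share down is an assumption, and the accuracy check assumes the test set has the same class mix as the full image set (which the description states).

```python
from fractions import Fraction

# Counts quoted in the description.
POS_FRAMES, NEG_FRAMES = 509, 6_875
POS_IMAGES, NEG_IMAGES = 15_270, 20_625


def augmentation_factor(images: int, frames: int) -> int:
    """Return images / frames, insisting that the ratio is a whole number."""
    factor, remainder = divmod(images, frames)
    if remainder:
        raise ValueError("image count is not a whole multiple of frame count")
    return factor


def split_sizes(total: int, train_fraction: Fraction = Fraction(9, 10)) -> tuple[int, int]:
    """Split total into (train, test); rounding the training share down is an assumption."""
    train = int(total * train_fraction)
    return train, total - train


def expected_accuracy(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Accuracy as the prevalence-weighted mean of sensitivity and specificity."""
    return sensitivity * prevalence + specificity * (1 - prevalence)


total = POS_IMAGES + NEG_IMAGES
prevalence = POS_IMAGES / total  # assumes the test set mirrors the overall class mix

print(augmentation_factor(POS_IMAGES, POS_FRAMES))
print(augmentation_factor(NEG_IMAGES, NEG_FRAMES))
print(split_sizes(total))
print(round(expected_accuracy(0.73, 0.96, prevalence), 2))
```

Under these assumptions the quoted counts agree with one another: each polyp-positive frame corresponds to 30 augmented images and each polyp-free frame to 3, the 35,895 images split into 32,305 for training and 3,590 for testing, and the reported sensitivity and specificity imply an overall accuracy of about 0.86.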
url http://www.thieme-connect.de/DOI/DOI?10.1055/a-1229-3927
_version_ 1724449992064630784