Automated radiology report generation using conditioned transformers

Radiology report writing in hospitals is a time-consuming task that also requires experience from the involved radiologists. This paper proposes a deep learning model to automatically generate radiology reports given a chest x-ray image from the public IU-Xray dataset. Our work consists of three sta...


Bibliographic Details
Main Authors: Omar Alfarghaly, Rana Khaled, Abeer Elkorany, Maha Helal, Aly Fahmy
Format: Article
Language: English
Published: Elsevier 2021-01-01
Series: Informatics in Medicine Unlocked
Subjects: Report generation; Transformers; GPT2; Transfer learning; X-ray; Deep learning
Online Access: http://www.sciencedirect.com/science/article/pii/S2352914821000472
id doaj-84674192c4d149059506d274bf996c28
record_format Article
spelling doaj-84674192c4d149059506d274bf996c28
2021-06-19T04:54:57Z
eng
Elsevier
Informatics in Medicine Unlocked
2352-9148
2021-01-01
24
100557
Automated radiology report generation using conditioned transformers
Omar Alfarghaly (Computers and Artificial Intelligence, Cairo University, Cairo, Egypt; Corresponding author. Giza Governorate, 12613, Egypt.)
Rana Khaled (National Institute of Cancer, Cairo University, Cairo, Egypt)
Abeer Elkorany (Computers and Artificial Intelligence, Cairo University, Cairo, Egypt)
Maha Helal (National Institute of Cancer, Cairo University, Cairo, Egypt)
Aly Fahmy (Computers and Artificial Intelligence, Cairo University, Cairo, Egypt)
collection DOAJ
language English
format Article
sources DOAJ
author Omar Alfarghaly
Rana Khaled
Abeer Elkorany
Maha Helal
Aly Fahmy
author_sort Omar Alfarghaly
title Automated radiology report generation using conditioned transformers
publisher Elsevier
series Informatics in Medicine Unlocked
issn 2352-9148
publishDate 2021-01-01
description Writing radiology reports in hospitals is time-consuming and requires experienced radiologists. This paper proposes a deep learning model that automatically generates a radiology report from a chest X-ray image, using the public IU-Xray dataset. Our work consists of three stages: (1) fine-tune a pre-trained CheXNet to predict specific tags from the image; (2) calculate weighted semantic features from the predicted tags' pre-trained embeddings; (3) condition a pre-trained GPT2 model on the visual and semantic features to generate the full medical report. We evaluate the generated reports using word-overlap metrics and add new semantic similarity metrics. The proposed model, which we call CDGPT2, surpassed most non-hierarchical recurrent models and transformer-based models in quantitative metrics while being considerably faster to train. Moreover, the model does not require a specific vocabulary and can be trained on different datasets without changing the architecture. Furthermore, we include a qualitative analysis by a radiologist from Egypt's National Institute of Cancer, which showed that 61.6% of the generated reports on the test set were expertly written and only 10% contained false information. We present the first work to condition a pre-trained transformer on visual and semantic features to generate medical reports and to include semantic similarity metrics in the quantitative analysis of the generated reports.
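Stage (2) of the abstract, computing weighted semantic features from the predicted tags' embeddings, can be illustrated with a minimal sketch. It assumes one plausible reading of "weighted": each tag's embedding vector is scaled by the probability the tag classifier assigns to that tag. The function name, array shapes, and toy values below are hypothetical and are not taken from the paper.

```python
import numpy as np


def weighted_semantic_features(tag_probs, tag_embeddings):
    """Scale each tag embedding by its predicted probability.

    tag_probs:      shape (n_tags,), classifier output per tag (assumed in [0, 1]).
    tag_embeddings: shape (n_tags, dim), one pre-trained embedding per tag.
    Returns an array of shape (n_tags, dim).

    Assumption: "weighted" means probability * embedding; the paper may
    define the weighting differently.
    """
    tag_probs = np.asarray(tag_probs, dtype=float)
    tag_embeddings = np.asarray(tag_embeddings, dtype=float)
    if tag_probs.ndim != 1 or tag_embeddings.ndim != 2:
        raise ValueError("expected 1-D probabilities and 2-D embeddings")
    if tag_probs.shape[0] != tag_embeddings.shape[0]:
        raise ValueError("need exactly one probability per tag embedding")
    # Broadcasting: (n_tags, 1) * (n_tags, dim) -> (n_tags, dim)
    return tag_probs[:, None] * tag_embeddings


# Toy example with 3 hypothetical tags and 2-dimensional embeddings.
probs = np.array([0.9, 0.1, 0.5])
embeddings = np.array([[1.0, 0.0],
                       [0.0, 2.0],
                       [4.0, 4.0]])
features = weighted_semantic_features(probs, embeddings)
print(features)
```

Under this reading, tags the classifier considers unlikely contribute vectors close to zero, so the language model in stage (3) would be conditioned mostly on the tags that were confidently predicted.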
topic Report generation
Transformers
GPT2
Transfer learning
X-ray
Deep learning
url http://www.sciencedirect.com/science/article/pii/S2352914821000472