Summary: | Writing radiology reports in hospitals is time-consuming and requires experienced radiologists. This paper proposes a deep learning model that automatically generates a radiology report given a chest X-ray image from the public IU-Xray dataset. Our work consists of three stages: (1) fine-tune a pre-trained Chexnet to predict specific tags from the image; (2) calculate weighted semantic features from the pre-trained embeddings of the predicted tags; (3) condition a pre-trained GPT2 model on the visual and semantic features to generate the full medical report. We evaluate the generated reports using word-overlap metrics and also add new semantic similarity metrics. The proposed model, which we call CDGPT2, surpassed most non-hierarchical recurrent models and transformer-based models on quantitative metrics while being considerably faster to train. Moreover, the model does not require a specific vocabulary and can be trained on different datasets without changing the architecture. Furthermore, we include a qualitative analysis by a radiologist from Egypt's National Institute of Cancer, which showed that 61.6% of the generated reports on the test set were expertly written and only 10% contained false information. This is the first work to condition a pre-trained transformer on visual and semantic features to generate medical reports and to include semantic similarity metrics in the quantitative analysis of the generated reports.
|
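Stage (2) of the summary can be illustrated with a minimal sketch. It assumes that "weighted" means each predicted tag's pre-trained embedding is scaled by the probability the classifier assigns to that tag; the summary does not spell out the weighting scheme, so this is one plausible reading, not the authors' implementation. The function name `weighted_semantic_features` and the toy values are hypothetical.

```python
from typing import List, Sequence


def weighted_semantic_features(
    tag_probs: Sequence[float],
    tag_embeddings: Sequence[Sequence[float]],
) -> List[List[float]]:
    """Scale each tag's embedding by the probability predicted for that tag.

    Assumption: one probability per tag, one embedding vector per tag.
    """
    if len(tag_probs) != len(tag_embeddings):
        raise ValueError("need exactly one probability per tag embedding")
    return [
        [prob * value for value in embedding]
        for prob, embedding in zip(tag_probs, tag_embeddings)
    ]


# Toy inputs for illustration only (not taken from the paper):
# three tags with 2-dimensional embeddings.
probs = [0.9, 0.1, 0.5]
embeddings = [[1.0, 2.0], [4.0, 0.0], [2.0, 2.0]]
features = weighted_semantic_features(probs, embeddings)
```

Under this reading, tags the classifier is confident about contribute embeddings close to their original magnitude, while unlikely tags are scaled toward zero before the features are passed to the language model.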