Summary: | Diabetic retinopathy (DR) is a highly prevalent complication of diabetes mellitus that causes lesions on the retina, impairs vision, and may lead to blindness if it is not detected and diagnosed early. Convolutional neural networks (CNNs) are becoming the state-of-the-art approach for automatic detection of DR from fundus images. The high-level features extracted by CNNs are mostly used to detect and classify lesions on the retina. This high-level representation can distinguish between DR classes; however, more effective features are needed to detect the damage. This paper proposes the multi-scale attention network (MSA-Net) for DR classification. The proposed approach applies an encoder network to embed the retina image in a high-level representational space, where a combination of mid- and high-level features is used to enrich the representation. A multi-scale feature pyramid is then included to describe the retinal structure at different localities. Furthermore, to enhance the discriminative power of the feature representation, a multi-scale attention mechanism is applied on top of the high-level representation. The model is trained in the standard way, using the cross-entropy loss to classify the DR severity level. In parallel, as an auxiliary task, the model is trained on weakly annotated data to distinguish healthy from non-healthy retina images. This surrogate task helps the model enrich its discriminative power for identifying non-healthy retina images. The proposed method achieves outstanding results on two public datasets: EyePACS and APTOS.
|
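The two training objectives described in the summary (cross-entropy on the DR severity level, plus an auxiliary healthy versus non-healthy task) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the five-grade severity scale, the two-logit auxiliary head, and the weighting factor `aux_weight` are all assumptions made for illustration; the summary does not say how the two losses are combined.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Cross-entropy loss for one sample with an integer class label."""
    return -math.log(softmax(logits)[target])


def joint_loss(severity_logits, severity_label,
               health_logits, health_label, aux_weight=0.5):
    """Main severity loss plus a weighted auxiliary healthy/non-healthy loss.

    Assumptions (not stated in the summary): the losses are combined as a
    weighted sum, and aux_weight=0.5 is an arbitrary illustrative value.
    """
    main = cross_entropy(severity_logits, severity_label)
    aux = cross_entropy(health_logits, health_label)
    return main + aux_weight * aux


# Hypothetical sample: severity grade 2 of 5, weak label 1 (non-healthy).
loss = joint_loss([0.1, 0.3, 2.0, 0.2, -0.5], 2, [-1.0, 1.5], 1)
print(f"joint loss: {loss:.4f}")
```

In this sketch the auxiliary label is passed in separately instead of being derived from the severity grade, since the summary states that the auxiliary task is trained on weakly annotated data.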