Summary: | In dysphagia, food material frequently invades the laryngeal airway, potentially resulting in serious consequences such as asphyxia or pneumonia. The videofluoroscopic swallowing study (VFSS) can be used to visualize airway invasion, but its reliability is limited by human error and fatigue. Deep learning may improve the efficiency and reliability of VFSS analysis by reducing the human effort required. A deep learning model has been developed that detects airway invasion from VFSS images in a fully automated manner. The model consists of three phases: (1) image normalization, (2) dynamic region of interest (ROI) determination, and (3) airway invasion detection. Noise induced by movement, and learning from unintended areas, are minimized by defining a “dynamic” ROI relative to the center of the cervical spinal column, which is segmented using U-Net. An Xception module, trained on a dataset of 267,748 image frames obtained from 319 VFSS video files, is used to detect airway invasion. The present model shows an overall accuracy of 97.2% in classifying image frames and 93.2% in classifying video files. It is anticipated that the present model will enable more accurate analysis of VFSS data.
|
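The dynamic ROI step described in the summary can be illustrated with a minimal sketch. It assumes that a binary segmentation mask of the cervical spinal column is already available (for example, a thresholded U-Net output) and that the "center" is the centroid of that mask. The function names `mask_center` and `dynamic_roi`, the `size` and `offset` parameters, and the clamping to the image border are hypothetical choices made for illustration; they are not taken from the summary.

```python
import numpy as np


def mask_center(mask):
    """Return the (row, col) centroid of the non-zero pixels in a binary mask."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        raise ValueError("segmentation mask is empty")
    return int(round(float(ys.mean()))), int(round(float(xs.mean())))


def dynamic_roi(frame, mask, size=(4, 4), offset=(0, 0)):
    """Crop a fixed-size window positioned relative to the mask centroid.

    size   -- (height, width) of the crop; an assumed, illustrative value.
    offset -- (rows, cols) shift from the centroid; hypothetical parameter.
    The window is clamped so that it always lies inside the frame.
    """
    h, w = frame.shape[:2]
    rh, rw = size
    if rh > h or rw > w:
        raise ValueError("ROI is larger than the frame")
    cy, cx = mask_center(mask)
    cy, cx = cy + offset[0], cx + offset[1]
    # Clamp the top-left corner so the crop stays within the image bounds.
    top = min(max(cy - rh // 2, 0), h - rh)
    left = min(max(cx - rw // 2, 0), w - rw)
    return frame[top:top + rh, left:left + rw]


# Toy usage: a 10x10 "frame" and a 3x3 blob standing in for the spine mask.
frame = np.arange(100).reshape(10, 10)
mask = np.zeros((10, 10), dtype=np.uint8)
mask[4:7, 5:8] = 1
roi = dynamic_roi(frame, mask, size=(4, 4))
```

Because the crop follows the segmented structure from frame to frame, the classifier input stays aligned with the anatomy even when the subject moves, which is the motivation the summary gives for a "dynamic" rather than fixed ROI.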