Performance of AI-Based Automated Classifications of Whole-Body FDG PET in Clinical Practice: The CLARITI Project


Bibliographic Details
Main Authors: Berenbaum, A. (Author), Besson, F.L. (Author), Bréant, S. (Author), Daniel, C. (Author), Delingette, H. (Author), Durand, E. (Author), Frank, M. (Author), Grimaldi, L. (Author), Hassen-Khodja, C. (Author), Maire, A. (Author), Martel, P. (Author), Poret, C. (Author)
Format: Article
Language: English
Published: MDPI 2023
Description
Summary: Purpose: To assess the feasibility of a three-dimensional deep convolutional neural network (3D-CNN) for the general triage of whole-body FDG PET in daily clinical practice. Methods: An institutional clinical data warehouse working environment was devoted to this PET imaging purpose. Dedicated request procedures and data processing workflows were specifically developed within this infrastructure and applied retrospectively to a monocentric dataset as a proof of concept. A custom-made 3D-CNN was first trained and tested on an “unambiguous” well-balanced data sample, which included strictly normal and highly pathological scans. For the training phase, 90% of the data sample was used (learning set: 80%; validation set: 20%; 5-fold cross-validation) and the remaining 10% constituted the test set. Finally, the model was applied to a “real-life” test set which included any scans taken. Text mining of the PET reports, systematically combined with visual rechecking by an experienced reader, served as the standard of truth for PET labeling. Results: From 8125 scans, 4963 PETs had processable cross-matched medical reports. For the “unambiguous” dataset (1084 PETs), the 3D-CNN’s overall results for sensitivity, specificity, positive and negative predictive values (PPV and NPV) and positive and negative likelihood ratios (LR+ and LR−) were 84%, 98%, 98%, 85%, 42.0 and 0.16, respectively (F1 score of 90%). When applied to the “real-life” dataset (4963 PETs), the sensitivity, NPV, LR+, LR− and F1 score substantially deteriorated (61%, 40%, 2.97, 0.49 and 73%, respectively), whereas the specificity and PPV remained high (79% and 90%). Conclusion: An AI-based triage of whole-body FDG PET is promising. Further studies are needed to overcome the challenges presented by the imperfection of real-life PET data. © 2023 by the authors.
ISSN: 2076-3417
DOI: 10.3390/app13095281
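
The likelihood ratios and F1 score quoted in the summary can be recomputed from the reported sensitivity, specificity and PPV using the standard definitions. The following is a minimal Python sketch of that arithmetic; the function names are illustrative and do not come from the article, and the inputs are the rounded percentages from the summary, so the results may differ slightly from the published values.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) from sensitivity and specificity given as fractions."""
    lr_pos = sensitivity / (1.0 - specificity)   # LR+ = sens / (1 - spec)
    lr_neg = (1.0 - sensitivity) / specificity   # LR- = (1 - sens) / spec
    return lr_pos, lr_neg


def f1_score(ppv, sensitivity):
    """Return the F1 score, the harmonic mean of PPV (precision) and sensitivity (recall)."""
    return 2.0 * ppv * sensitivity / (ppv + sensitivity)


# Rounded values taken from the summary (assumption: the underlying
# unrounded values are not available here).
unambiguous = {"sensitivity": 0.84, "specificity": 0.98, "ppv": 0.98}
real_life = {"sensitivity": 0.61, "specificity": 0.79, "ppv": 0.90}

for name, m in (("unambiguous", unambiguous), ("real-life", real_life)):
    lr_pos, lr_neg = likelihood_ratios(m["sensitivity"], m["specificity"])
    f1 = f1_score(m["ppv"], m["sensitivity"])
    print(f"{name}: LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}, F1 = {f1:.2f}")
```

With these rounded inputs, the "unambiguous" figures reproduce the reported LR+ of 42.0, LR− of 0.16 and F1 of 90%. For the "real-life" dataset the sketch gives an LR+ of about 2.90 rather than the reported 2.97, presumably because the published percentages are rounded; LR− (0.49) and F1 (73%) match.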