Summary: | INTRODUCTION: Voxel-based lesion-symptom mapping (VLSM) conventionally relies on the skill and knowledge of experts to manually delineate brain lesions. This process is time-consuming and is likely to have substantial inter-rater variability. Here, we propose a supervised machine learning framework for lesion segmentation that learns from a single modality and existing manual segmentations in order to delineate lesions in new patients.
METHODS: Data from 60 patients with chronic stroke aphasia were used in the study (age: 59.7±11.5 yrs, post-stroke interval: 5±2.9 yrs, male/female ratio: 34/26). From a single T1 image of each subject, additional features were created that provided complementary information, such as difference from template, tissue segmentation, brain asymmetries, gradient magnitude, and deviations of these images from 80 age- and gender-matched controls. These features were fed into the MRV-NRF (multi-resolution voxel-wise neighborhood random forest; Tustison et al., 2014) prediction algorithm implemented in ANTsR (Avants, 2015). The algorithm incorporates information from each voxel and its surrounding neighbors across all of the above features, in a hierarchy of random forest predictions from low to high resolution. The validity of the framework was tested with 6-fold cross-validation (i.e., train on 50 subjects, predict 10). The process was repeated ten times, producing ten segmentations for each subject, from which the average solution was binarized. Predicted lesions were compared to manually defined lesions, and VLSM models were built on 4 language measures: repetition and comprehension subscores from the WAB (Kertesz, 1982), WAB-AQ, and PNT naming accuracy (Roach, Schwartz, Martin, Grewal, & Brecher, 1996).
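The validation scheme described above (6-fold splits of 60 subjects, repeated runs averaged and binarized) can be sketched as follows. This is a minimal illustration, not the ANTsR implementation: the function names `six_fold_splits` and `consensus_mask` are hypothetical, and the 0.5 binarization threshold is an assumption, since the abstract does not state the cutoff used.

```python
import numpy as np

def six_fold_splits(n_subjects=60, n_folds=6, seed=0):
    """Yield (train, test) index arrays for one shuffled k-fold pass."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_subjects)
    folds = np.array_split(order, n_folds)
    for k in range(n_folds):
        test = folds[k]
        train = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        yield train, test

def consensus_mask(predictions, threshold=0.5):
    """Average repeated binary segmentations and binarize the mean.

    The threshold is an assumed value, not one reported in the study.
    """
    stacked = np.stack(predictions).astype(float)
    return stacked.mean(axis=0) >= threshold

# Each split trains on 50 subjects and predicts the remaining 10.
splits = list(six_fold_splits())

# Toy example: three repeated segmentations of a 4-voxel image.
runs = [np.array([1, 1, 0, 0]), np.array([1, 0, 0, 0]), np.array([1, 1, 1, 0])]
final = consensus_mask(runs)
```

Changing `seed` across repetitions would give a different partition each time, which is one plausible way to obtain the ten segmentations per subject.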
RESULTS: Manual and predicted lesion sizes were highly correlated (r=0.96). Compared to manual lesions, the predicted lesions had a Dice overlap of 0.72 (±0.14 SD), a case-wise maximum distance (Hausdorff) of 21mm (±16.4), and an area under the ROC curve of 0.86 (±0.09). Lesion size correlated with overlap (r=0.5, p<0.001), but not with maximum displacement (r=-0.15, p=0.27). VLSM thresholded t-maps (p<0.05, FDR corrected) showed a continuous Dice overlap of 0.75 for AQ, 0.81 for repetition, 0.57 for comprehension, and 0.58 for naming (Figure 1). To investigate whether the mismatch between manual VLSM and automated VLSM involved critical areas related to cognitive performance, we created behavioral predictions from the VLSM models. Briefly, a prediction value was obtained from each voxel and the weighted average of all voxels was computed (i.e., voxels with high t-values contributed more to the prediction than voxels with low t-values). Manual VLSM showed a slightly higher correlation between predicted and actual performance than automated VLSM (respectively, AQ: 0.65 and 0.60, repetition: 0.62 and 0.57, comprehension: 0.53 and 0.48, naming: 0.46 and 0.41). The difference between the two, however, was not significant (lowest p=0.07).
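Two of the quantities reported above, the Dice overlap between binary lesion masks and the t-weighted average of voxel-wise predictions, can be illustrated with a short sketch. The function names are hypothetical, and using the absolute t-value directly as the weight is an assumption; the abstract only states that voxels with high t-values contributed more.

```python
import numpy as np

def dice_overlap(mask_a, mask_b):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for two binary masks."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    total = a.sum() + b.sum()
    if total == 0:
        return 1.0  # convention chosen here: two empty masks agree fully
    return float(2.0 * np.logical_and(a, b).sum() / total)

def weighted_prediction(voxel_predictions, t_values):
    """Average voxel-wise predictions, weighting each by |t| (assumed scheme)."""
    return float(np.average(voxel_predictions, weights=np.abs(t_values)))

# Toy masks sharing one of two lesioned voxels each.
manual = np.array([1, 1, 0, 0])
predicted = np.array([1, 0, 1, 0])
overlap = dice_overlap(manual, predicted)

# Two voxels predicting scores of 10 and 20 with t-values 1 and 3.
score = weighted_prediction([10.0, 20.0], [1.0, 3.0])
```

In the toy case the second voxel has three times the weight of the first, so the combined prediction sits closer to 20 than to 10.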
CONCLUSIONS: These findings show that automated lesion segmentation is a viable alternative to manual delineation, producing lesion-symptom maps and behavioral predictions similar to those obtained with standard manual segmentations. Given its ability to learn from existing manual delineations, the tool can be implemented in ongoing projects either to fully automate lesion segmentation, or to provide a preliminary delineation to be corrected by the expert.
|