Reliability assessment of tissue classification algorithms for multi-center and multi-scanner data

Background: Gray and white matter volume difference and change are important imaging markers of pathology and disease progression in neurology and psychiatry. Such measures are usually estimated from tissue segmentation maps produced by publicly available image processing pipelines. However, the rel...

Bibliographic Details
Main Authors: Mahsa Dadar, Simon Duchesne
Format: Article
Language: English
Published: Elsevier 2020-08-01
Series: NeuroImage
Subjects: Reliability; Multi-center; Multi-scanner; tissue classification
Online Access: http://www.sciencedirect.com/science/article/pii/S1053811920304146
id doaj-9f2e7817968e4573ba9d9da5e02930e2
record_format Article
spelling doaj-9f2e7817968e4573ba9d9da5e02930e2
2020-11-25T03:25:50Z
eng
Elsevier
NeuroImage
1095-9572
2020-08-01
217
116928
Reliability assessment of tissue classification algorithms for multi-center and multi-scanner data
Mahsa Dadar (0)
Simon Duchesne (1)
(0) Corresponding author. CERVO Brain Research Centre, 2601 Chemin de la Canardière, Québec, G1J 2G3, Canada; Department of Radiology and Nuclear Medicine, Faculty of Medicine, Laval University, Canada
(1) Department of Radiology and Nuclear Medicine, Faculty of Medicine, Laval University, Canada
collection DOAJ
language English
format Article
sources DOAJ
author Mahsa Dadar
Simon Duchesne
title Reliability assessment of tissue classification algorithms for multi-center and multi-scanner data
publisher Elsevier
series NeuroImage
issn 1095-9572
publishDate 2020-08-01
description Background: Gray and white matter volume differences and changes are important imaging markers of pathology and disease progression in neurology and psychiatry. Such measures are usually estimated from tissue segmentation maps produced by publicly available image processing pipelines. However, the reliability of the produced segmentations when using multi-center and multi-scanner data remains understudied. Here, we assess the robustness of six publicly available tissue classification pipelines across images acquired from different MR scanners and sites. Methods: We used 90 T1-weighted images of a single individual, scanned in 73 sessions across 27 different sites, to assess the robustness of the tissue classification tools. Variability in Dice similarity index values and tissue volumes was assessed for Atropos, BISON, Classify_Clean, FAST, FreeSurfer, and SPM12. Results: BISON had the highest overall Dice coefficient for GM, followed by SPM12 and Atropos, while Atropos had the highest overall Dice coefficient for WM, followed by BISON and SPM12. BISON had the lowest overall variability in its volumetric estimates, followed by FreeSurfer and SPM12. All methods also showed significant differences between some of their estimates across different scanner manufacturers (e.g., BISON had significantly higher GM estimates and correspondingly lower WM estimates for GE scans compared to Philips and Siemens) and across different signal-to-noise ratio (SNR) levels (e.g., FAST and FreeSurfer had significantly higher WM volume estimates, and correspondingly lower GM volume estimates, for the high versus the medium and low SNR tertiles). Conclusions: Our comparisons provide a benchmark on the reliability of publicly available tissue classification techniques and the amount of variability that can be expected when using large multi-center and multi-scanner databases.
topic Reliability
Multi-center
Multi-scanner
tissue classification
url http://www.sciencedirect.com/science/article/pii/S1053811920304146
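
The description field above names two quantities, the Dice similarity index between segmentations and the variability of tissue volume estimates. The following is a minimal sketch of how such measures are commonly computed, not the authors' pipeline. The function names, the integer label convention (1 = GM, 2 = WM), and the use of the coefficient of variation as the variability measure are all assumptions made for illustration; the record does not state how the article computed either quantity.

```python
import numpy as np


def dice(seg_a, seg_b, label):
    """Dice overlap for one tissue label between two label maps.

    Assumes both maps are already in the same space and have the same shape.
    """
    a = np.asarray(seg_a) == label
    b = np.asarray(seg_b) == label
    denom = a.sum() + b.sum()
    if denom == 0:
        # Label absent from both maps: overlap is undefined.
        return float("nan")
    return float(2.0 * np.logical_and(a, b).sum() / denom)


def coefficient_of_variation(volumes):
    """Sample standard deviation divided by the mean (hypothetical variability measure)."""
    v = np.asarray(volumes, dtype=float)
    return float(v.std(ddof=1) / v.mean())


# Toy 2D label maps standing in for two segmentations of the same brain.
# Label values are illustrative: 0 = background, 1 = GM, 2 = WM.
scan_a = np.array([[1, 1, 2],
                   [2, 0, 0]])
scan_b = np.array([[1, 2, 2],
                   [2, 0, 1]])

gm_dice = dice(scan_a, scan_b, label=1)
wm_dice = dice(scan_a, scan_b, label=2)

# Toy tissue volumes (arbitrary units) from three repeated scans.
gm_cv = coefficient_of_variation([100.0, 110.0, 90.0])
```

In a multi-scanner setting, a Dice value would typically be computed for each pair of scans (or each scan against a reference) after registration to a common space, and the spread of volumes would be summarized per method.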