Detecting Neuroimaging Biomarkers for Psychiatric Disorders: Sample Size Matters

Recently it was suggested that much larger cohorts are needed to prove the diagnostic value of neuroimaging biomarkers in psychiatry. While an increase in diagnostic accuracy for schizophrenia with the number of subjects (N) has been shown within a sample, the relationship between N and accuracy is complete...

Bibliographic Details
Main Authors:	Hugo Schnack, Rene Kahn
Format:	Article
Language:	English
Published:	Frontiers Media S.A. 2016-03-01
Series:	Frontiers in Psychiatry
Subjects:	Neuroimaging; Schizophrenia; machine learning; heterogeneity; classification and prediction; effect size
Online Access:	http://journal.frontiersin.org/Journal/10.3389/fpsyt.2016.00050/full

id	doaj-ea014ca10a3e4fba8f6ca02cde92ad59
record_format	Article
affiliation	Brain Center Rudolf Magnus, University Medical Center Utrecht
volume	7
doi	10.3389/fpsyt.2016.00050
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Hugo Schnack, Rene Kahn
title	Detecting Neuroimaging Biomarkers for Psychiatric Disorders: Sample Size Matters
publisher	Frontiers Media S.A.
series	Frontiers in Psychiatry
issn	1664-0640
publishDate	2016-03-01
description	Recently it was suggested that much larger cohorts are needed to prove the diagnostic value of neuroimaging biomarkers in psychiatry. While an increase in diagnostic accuracy for schizophrenia with the number of subjects (N) has been shown within a sample, the relationship between N and accuracy differs completely between studies. Using data from a meta-analysis of machine learning in imaging schizophrenia, we found that while low-N studies can reach accuracies of 90% and higher, above N/2=50 the maximum accuracy achieved steadily drops, to below 70% for N/2>150. We investigate the role N plays in the wide variability in accuracy results (63-97%). We hypothesize that the underlying cause of the decrease in accuracy with increasing N is sample heterogeneity. While smaller studies can more easily include a homogeneous group of subjects (strict inclusion criteria are easily met; subjects live close to the study site), larger studies inevitably need to relax the criteria or recruit from large geographic areas. A schizophrenia prediction model based on a heterogeneous group of patients, with presumably a heterogeneous pattern of structural or functional brain changes, will not be able to capture the whole variety of changes and is thus limited to patterns shared by most patients. In addition to heterogeneity, we investigate other factors influencing accuracy and introduce a machine learning effect size. We derive a simple model of how the different factors, such as sample heterogeneity, determine this effect size, and explain the variation in prediction accuracies found in the literature, both in cross-validation and in independent sample testing. From this we argue that smaller-N studies may reach high prediction accuracy at the cost of lower generalizability to other samples. Higher-N studies, on the other hand, will have more generalization power, but at the cost of lower accuracy. In conclusion, when comparing results from different machine learning studies, the sample sizes should be taken into account. To assess the generalizability of the models, the prediction models should be validated in independent samples. The prediction of more complex measures such as outcome, which are expected to have an underlying pattern of more subtle brain abnormalities, will require large (multicenter) studies.
topic	Neuroimaging; Schizophrenia; machine learning; heterogeneity; classification and prediction; effect size
url	http://journal.frontiersin.org/Journal/10.3389/fpsyt.2016.00050/full

Similar Items