Variant effect prediction tools assessed using independent, functional assay-based datasets: implications for discovery and diagnostics

Abstract. Background: Genetic variant effect prediction algorithms are used extensively in clinical genomics and research to determine the likely consequences of amino acid substitutions on protein function. It is vital that we better understand their accuracies and limitations because published perfo...

Bibliographic Details
Main Authors:	Khalid Mahmood, Chol-hee Jung, Gayle Philip, Peter Georgeson, Jessica Chung, Bernard J. Pope, Daniel J. Park
Format:	Article
Language:	English
Published:	BMC 2017-05-01
Series:	Human Genomics
Subjects:	Variant effect prediction; Functional datasets; Benchmarking; Mutation assessment; Pathogenicity prediction; Protein function
Online Access:	http://link.springer.com/article/10.1186/s40246-017-0104-8

id	doaj-7e7690a42860424cb6876efa0369d02c
record_format	Article
spelling	doaj-7e7690a42860424cb6876efa0369d02c | 2020-11-25T00:26:20Z | eng | BMC | Human Genomics | 1479-7364 | 2017-05-01 | 11 | 1 | 1 | 8 | 10.1186/s40246-017-0104-8 | Variant effect prediction tools assessed using independent, functional assay-based datasets: implications for discovery and diagnostics | Khalid Mahmood, Chol-hee Jung, Gayle Philip, Peter Georgeson, Jessica Chung, Bernard J. Pope, Daniel J. Park (all: Melbourne Bioinformatics, The University of Melbourne) | http://link.springer.com/article/10.1186/s40246-017-0104-8 | Variant effect prediction; Functional datasets; Benchmarking; Mutation assessment; Pathogenicity prediction; Protein function
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Khalid Mahmood, Chol-hee Jung, Gayle Philip, Peter Georgeson, Jessica Chung, Bernard J. Pope, Daniel J. Park
author_sort	Khalid Mahmood
title	Variant effect prediction tools assessed using independent, functional assay-based datasets: implications for discovery and diagnostics
publisher	BMC
series	Human Genomics
issn	1479-7364
publishDate	2017-05-01
description	Background: Genetic variant effect prediction algorithms are used extensively in clinical genomics and research to determine the likely consequences of amino acid substitutions on protein function. It is vital that we better understand their accuracies and limitations because published performance metrics are confounded by serious problems of circularity and error propagation. Here, we derive three independent, functionally determined human mutation datasets, UniFun, BRCA1-DMS and TP53-TA, and employ them, alongside previously described datasets, to assess the pre-eminent variant effect prediction tools. Results: Apparent accuracies of variant effect prediction tools were influenced significantly by the benchmarking dataset. Benchmarking with the assay-determined datasets UniFun and BRCA1-DMS yielded areas under the receiver operating characteristic curves in the modest ranges of 0.52 to 0.63 and 0.54 to 0.75, respectively, considerably lower than observed for other, potentially more conflicted datasets. Conclusions: These results raise concerns about how such algorithms should be employed, particularly in a clinical setting. Contemporary variant effect prediction tools are unlikely to be as accurate at the general prediction of functional impacts on proteins as reported prior. Use of functional assay-based datasets that avoid prior dependencies promises to be valuable for the ongoing development and accurate benchmarking of such tools.
topic	Variant effect prediction; Functional datasets; Benchmarking; Mutation assessment; Pathogenicity prediction; Protein function
url	http://link.springer.com/article/10.1186/s40246-017-0104-8
_version_	1725344686061649920
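
Note on the reported metric: the abstract summarises tool performance as the area under the receiver operating characteristic curve (AUC). The sketch below is only an illustration of how such a value can be computed for one tool against one assay-labelled dataset; it is not the authors' pipeline. The function name roc_auc and the toy labels and scores are hypothetical, and the pairwise (Mann-Whitney) formulation is one standard way to obtain the AUC, chosen here because it needs no external library.

```python
def roc_auc(labels, scores):
    """Return the ROC AUC for binary labels (1 = damaging, 0 = neutral).

    Computed as the fraction of (damaging, neutral) pairs in which the
    damaging variant receives the higher score; ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one variant of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: assay-derived labels and one tool's scores.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.4, 0.6, 0.5, 0.3, 0.6]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.2f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is why the ranges of 0.52 to 0.63 and 0.54 to 0.75 quoted in the description are characterised there as modest.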

Similar Items