Accurate and efficient gp120 V3 loop structure based models for the determination of HIV-1 co-receptor usage

Abstract. Background: HIV-1 targets human cells expressing both the CD4 receptor, which binds the viral envelope glycoprotein gp120, as well as either the CCR5 (R5) or CXCR4 (X4) co-receptors, which interact primarily with the third hypervariable loop (V3...

Bibliographic Details
Main Authors:	Vaisman Iosif I, Masso Majid
Format:	Article
Language:	English
Published:	BMC 2010-10-01
Series:	BMC Bioinformatics
Online Access:	http://www.biomedcentral.com/1471-2105/11/494

id	doaj-2d340ffdc5634050bab335ac10e92190
record_format	Article
spelling	doaj-2d340ffdc5634050bab335ac10e92190 | 2020-11-24T22:23:22Z | eng | BMC | BMC Bioinformatics | 1471-2105 | 2010-10-01 | 11 | 1 | 494 | 10.1186/1471-2105-11-494 | Accurate and efficient gp120 V3 loop structure based models for the determination of HIV-1 co-receptor usage | Vaisman Iosif I | Masso Majid | http://www.biomedcentral.com/1471-2105/11/494
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Vaisman Iosif I, Masso Majid
spellingShingle	Vaisman Iosif I Masso Majid Accurate and efficient gp120 V3 loop structure based models for the determination of HIV-1 co-receptor usage BMC Bioinformatics
author_facet	Vaisman Iosif I, Masso Majid
author_sort	Vaisman Iosif I
title	Accurate and efficient gp120 V3 loop structure based models for the determination of HIV-1 co-receptor usage
title_sort	accurate and efficient gp120 v3 loop structure based models for the determination of hiv-1 co-receptor usage
publisher	BMC
series	BMC Bioinformatics
issn	1471-2105
publishDate	2010-10-01
description	Abstract. Background: HIV-1 targets human cells expressing both the CD4 receptor, which binds the viral envelope glycoprotein gp120, as well as either the CCR5 (R5) or CXCR4 (X4) co-receptors, which interact primarily with the third hypervariable loop (V3 loop) of gp120. Determination of HIV-1 affinity for either the R5 or X4 co-receptor on host cells facilitates the inclusion of co-receptor antagonists as a part of patient treatment strategies. A dataset of 1193 distinct gp120 V3 loop peptide sequences (989 R5-utilizing, 204 X4-capable) is utilized to train predictive classifiers based on implementations of random forest, support vector machine, boosted decision tree, and neural network machine learning algorithms. An in silico mutagenesis procedure employing multibody statistical potentials, computational geometry, and threading of variant V3 sequences onto an experimental structure is used to generate a feature vector representation for each variant whose components measure environmental perturbations at corresponding structural positions. Results: Classifier performance is evaluated based on stratified 10-fold cross-validation, stratified dataset splits (2/3 training, 1/3 validation), and leave-one-out cross-validation. Best reported values of sensitivity (85%), specificity (100%), and precision (98%) for predicting X4-capable HIV-1 virus, overall accuracy (97%), Matthews correlation coefficient (89%), balanced error rate (0.08), and ROC area (0.97) all reach critical thresholds, suggesting that the models outperform six other state-of-the-art methods and come closer to competing with phenotype assays. Conclusions: The trained classifiers provide instantaneous and reliable predictions regarding HIV-1 co-receptor usage, requiring only translated V3 loop genotypes as input. Furthermore, the novelty of these computational mutagenesis based predictor attributes distinguishes the models as orthogonal and complementary to previous methods that utilize sequence, structure, and/or evolutionary information. The classifiers are available online at http://proteins.gmu.edu/automute.
url	http://www.biomedcentral.com/1471-2105/11/494
work_keys_str_mv	AT vaismaniosifi accurateandefficientgp120v3loopstructurebasedmodelsforthedeterminationofhiv1coreceptorusage AT massomajid accurateandefficientgp120v3loopstructurebasedmodelsforthedeterminationofhiv1coreceptorusage
_version_	1725764651800592384
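
Note on the performance measures: most of the measures named in the abstract (sensitivity, specificity, precision, overall accuracy, Matthews correlation coefficient, balanced error rate) can be derived from a 2x2 confusion matrix in which X4-capable is treated as the positive class. The Python sketch below shows the standard textbook definitions only. The function name `x4_metrics` and the counts in the usage lines are hypothetical and are not taken from the article, and the code is not the authors' implementation. ROC area is omitted because it requires per-sequence classifier scores rather than counts.

```python
import math


def x4_metrics(tp, fp, tn, fn):
    """Standard binary-classification measures, X4-capable = positive class.

    tp/fn: X4-capable sequences predicted X4 / predicted R5.
    tn/fp: R5 sequences predicted R5 / predicted X4.
    Assumes both classes are non-empty and at least one positive prediction.
    """
    pos = tp + fn  # all truly X4-capable sequences
    neg = tn + fp  # all truly R5 sequences
    sensitivity = tp / pos
    specificity = tn / neg
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (pos + neg)
    # Matthews correlation coefficient; defined as 0 when the denominator is 0.
    denom = math.sqrt((tp + fp) * pos * neg * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    # Balanced error rate: mean of the two per-class error rates.
    ber = 0.5 * (fn / pos + fp / neg)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "mcc": mcc,
        "balanced_error_rate": ber,
    }


# Hypothetical counts for illustration only (not results from the article).
example = x4_metrics(tp=8, fp=5, tn=85, fn=2)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

Because the dataset is imbalanced (989 R5-utilizing versus 204 X4-capable sequences), overall accuracy alone can look high even when many X4-capable sequences are missed, which is presumably why class-balanced measures such as the Matthews correlation coefficient and the balanced error rate are reported alongside it.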

Similar Items