Accurate and efficient gp120 V3 loop structure based models for the determination of HIV-1 co-receptor usage

Abstract. Background: HIV-1 targets human cells expressing both the CD4 receptor, which binds the viral envelope glycoprotein gp120, and either the CCR5 (R5) or CXCR4 (X4) co-receptor, which interacts primarily with the third hypervariable loop (V3...

Bibliographic Details
Main Authors: Vaisman Iosif I, Masso Majid
Format: Article
Language: English
Published: BMC 2010-10-01
Series: BMC Bioinformatics
Online Access: http://www.biomedcentral.com/1471-2105/11/494
id doaj-2d340ffdc5634050bab335ac10e92190
record_format Article
spelling doaj-2d340ffdc5634050bab335ac10e92190 2020-11-24T22:23:22Z eng BMC BMC Bioinformatics 1471-2105 2010-10-01 11 1 494 10.1186/1471-2105-11-494 Accurate and efficient gp120 V3 loop structure based models for the determination of HIV-1 co-receptor usage Vaisman Iosif I Masso Majid Abstract. Background: HIV-1 targets human cells expressing both the CD4 receptor, which binds the viral envelope glycoprotein gp120, and either the CCR5 (R5) or CXCR4 (X4) co-receptor, which interacts primarily with the third hypervariable loop (V3 loop) of gp120. Determination of HIV-1 affinity for either the R5 or X4 co-receptor on host cells facilitates the inclusion of co-receptor antagonists as a part of patient treatment strategies. A dataset of 1193 distinct gp120 V3 loop peptide sequences (989 R5-utilizing, 204 X4-capable) is utilized to train predictive classifiers based on implementations of random forest, support vector machine, boosted decision tree, and neural network machine learning algorithms. An in silico mutagenesis procedure employing multibody statistical potentials, computational geometry, and threading of variant V3 sequences onto an experimental structure is used to generate a feature vector representation for each variant, whose components measure environmental perturbations at corresponding structural positions. Results: Classifier performance is evaluated based on stratified 10-fold cross-validation, stratified dataset splits (2/3 training, 1/3 validation), and leave-one-out cross-validation. Best reported values of sensitivity (85%), specificity (100%), and precision (98%) for predicting X4-capable HIV-1 virus, overall accuracy (97%), Matthews correlation coefficient (89%), balanced error rate (0.08), and ROC area (0.97) all reach critical thresholds, suggesting that the models outperform six other state-of-the-art methods and come closer to competing with phenotype assays. Conclusions: The trained classifiers provide instantaneous and reliable predictions regarding HIV-1 co-receptor usage, requiring only translated V3 loop genotypes as input. Furthermore, the novelty of these computational mutagenesis-based predictor attributes distinguishes the models as orthogonal and complementary to previous methods that utilize sequence, structure, and/or evolutionary information. The classifiers are available online at http://proteins.gmu.edu/automute. http://www.biomedcentral.com/1471-2105/11/494
collection DOAJ
language English
format Article
sources DOAJ
author Vaisman Iosif I
Masso Majid
spellingShingle Vaisman Iosif I
Masso Majid
Accurate and efficient gp120 V3 loop structure based models for the determination of HIV-1 co-receptor usage
BMC Bioinformatics
author_facet Vaisman Iosif I
Masso Majid
author_sort Vaisman Iosif I
title Accurate and efficient gp120 V3 loop structure based models for the determination of HIV-1 co-receptor usage
title_short Accurate and efficient gp120 V3 loop structure based models for the determination of HIV-1 co-receptor usage
title_full Accurate and efficient gp120 V3 loop structure based models for the determination of HIV-1 co-receptor usage
title_fullStr Accurate and efficient gp120 V3 loop structure based models for the determination of HIV-1 co-receptor usage
title_full_unstemmed Accurate and efficient gp120 V3 loop structure based models for the determination of HIV-1 co-receptor usage
title_sort accurate and efficient gp120 v3 loop structure based models for the determination of hiv-1 co-receptor usage
publisher BMC
series BMC Bioinformatics
issn 1471-2105
publishDate 2010-10-01
description Abstract. Background: HIV-1 targets human cells expressing both the CD4 receptor, which binds the viral envelope glycoprotein gp120, and either the CCR5 (R5) or CXCR4 (X4) co-receptor, which interacts primarily with the third hypervariable loop (V3 loop) of gp120. Determination of HIV-1 affinity for either the R5 or X4 co-receptor on host cells facilitates the inclusion of co-receptor antagonists as a part of patient treatment strategies. A dataset of 1193 distinct gp120 V3 loop peptide sequences (989 R5-utilizing, 204 X4-capable) is utilized to train predictive classifiers based on implementations of random forest, support vector machine, boosted decision tree, and neural network machine learning algorithms. An in silico mutagenesis procedure employing multibody statistical potentials, computational geometry, and threading of variant V3 sequences onto an experimental structure is used to generate a feature vector representation for each variant, whose components measure environmental perturbations at corresponding structural positions. Results: Classifier performance is evaluated based on stratified 10-fold cross-validation, stratified dataset splits (2/3 training, 1/3 validation), and leave-one-out cross-validation. Best reported values of sensitivity (85%), specificity (100%), and precision (98%) for predicting X4-capable HIV-1 virus, overall accuracy (97%), Matthews correlation coefficient (89%), balanced error rate (0.08), and ROC area (0.97) all reach critical thresholds, suggesting that the models outperform six other state-of-the-art methods and come closer to competing with phenotype assays. Conclusions: The trained classifiers provide instantaneous and reliable predictions regarding HIV-1 co-receptor usage, requiring only translated V3 loop genotypes as input. Furthermore, the novelty of these computational mutagenesis-based predictor attributes distinguishes the models as orthogonal and complementary to previous methods that utilize sequence, structure, and/or evolutionary information. The classifiers are available online at http://proteins.gmu.edu/automute.
url http://www.biomedcentral.com/1471-2105/11/494
work_keys_str_mv AT vaismaniosifi accurateandefficientgp120v3loopstructurebasedmodelsforthedeterminationofhiv1coreceptorusage
AT massomajid accurateandefficientgp120v3loopstructurebasedmodelsforthedeterminationofhiv1coreceptorusage
_version_ 1725764651800592384
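
Note on the evaluation metrics named in the abstract: sensitivity, specificity, precision, overall accuracy, Matthews correlation coefficient, and balanced error rate can all be derived from the four cells of a binary confusion matrix, with X4-capable treated as the positive class. The following is a minimal sketch using the standard textbook definitions; the function names and the toy counts are hypothetical and are not taken from the article, so the printed values are illustrative only and do not reproduce the reported results.

```python
import math


def ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def binary_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Compute standard binary-classification metrics from confusion counts.

    tp/fn: positive-class (here, X4-capable) items predicted right/wrong.
    tn/fp: negative-class (here, R5-utilizing) items predicted right/wrong.
    """
    sensitivity = ratio(tp, tp + fn)   # true positive rate (recall)
    specificity = ratio(tn, tn + fp)   # true negative rate
    precision = ratio(tp, tp + fp)     # positive predictive value
    accuracy = ratio(tp + tn, tp + fn + tn + fp)

    # Balanced error rate: mean of the two per-class error rates.
    balanced_error_rate = 0.5 * ((1.0 - sensitivity) + (1.0 - specificity))

    # Matthews correlation coefficient, defined as 0.0 if any margin is empty.
    mcc_denominator = math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    mcc = ratio(tp * tn - fp * fn, mcc_denominator)

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "balanced_error_rate": balanced_error_rate,
        "mcc": mcc,
    }


# Hypothetical toy counts (NOT from the article): 20 positives, 100 negatives.
toy_metrics = binary_metrics(tp=17, fn=3, tn=99, fp=1)
for name, value in toy_metrics.items():
    print(f"{name}: {value:.3f}")
```

Because the dataset described in the abstract is imbalanced (989 R5-utilizing versus 204 X4-capable sequences), accuracy alone can look high even when the minority class is poorly predicted, which is why class-balanced measures such as the balanced error rate and the Matthews correlation coefficient are worth reading alongside it.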