Mutual information and variants for protein domain-domain contact prediction
Abstract. Background: Predicting protein contacts solely based on sequence information remains a challenging problem, despite the huge amount of sequence data at our disposal. Mutual Information (MI), an information theory measure, has been extensively em...
Main Authors: | Gomes Mireille, Hamer Rebecca, Reinert Gesine, Deane Charlotte M |
---|---|
Format: | Article |
Language: | English |
Published: | BMC 2012-08-01 |
Series: | BMC Research Notes |
Online Access: | http://www.biomedcentral.com/1756-0500/5/472 |
id |
doaj-c41abd65fe224cc78f1831b8a67d67c6 |
---|---|
record_format |
Article |
spelling |
doaj-c41abd65fe224cc78f1831b8a67d67c6 2020-11-25T02:12:50Z eng BMC BMC Research Notes 1756-0500 2012-08-01 5 1 472 10.1186/1756-0500-5-472 Mutual information and variants for protein domain-domain contact prediction Gomes Mireille Hamer Rebecca Reinert Gesine Deane Charlotte M http://www.biomedcentral.com/1756-0500/5/472 |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Gomes Mireille Hamer Rebecca Reinert Gesine Deane Charlotte M |
spellingShingle |
Gomes Mireille Hamer Rebecca Reinert Gesine Deane Charlotte M Mutual information and variants for protein domain-domain contact prediction BMC Research Notes |
author_facet |
Gomes Mireille Hamer Rebecca Reinert Gesine Deane Charlotte M |
author_sort |
Gomes Mireille |
title |
Mutual information and variants for protein domain-domain contact prediction |
title_short |
Mutual information and variants for protein domain-domain contact prediction |
title_full |
Mutual information and variants for protein domain-domain contact prediction |
title_fullStr |
Mutual information and variants for protein domain-domain contact prediction |
title_full_unstemmed |
Mutual information and variants for protein domain-domain contact prediction |
title_sort |
mutual information and variants for protein domain-domain contact prediction |
publisher |
BMC |
series |
BMC Research Notes |
issn |
1756-0500 |
publishDate |
2012-08-01 |
description |
Abstract. Background: Predicting protein contacts based solely on sequence information remains a challenging problem, despite the huge amount of sequence data at our disposal. Mutual Information (MI), an information theory measure, has been extensively employed and modified to identify residues within a protein (intra-protein) that are in contact. More recently, MI and its variants have also been used to predict contacts between proteins (inter-protein). Methods: Here we assess the predictive power of MI and its variants for domain-domain contact prediction. We test original MI and the variants MIp, MIc and ZNMI on 40 domain-domain test cases containing 10,753 sequences. We also propose and evaluate two new versions of MI that consider triangles of residues and the physicochemical properties of the amino acids, respectively. Results: We found that all versions of MI are skewed towards predicting surface residues. Since domain-domain contacts are on the surface of each domain, we considered only surface residues when attempting to predict contacts. Our analysis shows that MIc is the best current MI domain-domain contact predictor. At 20% recall, MIc achieved a precision of 44.9% when only surface residues were considered. Our triangle and reduced alphabet variants of MI highlight the delicate trade-off between signal and noise in the use of MI for domain-domain contact prediction. We also examine a specific “successful” case study and demonstrate that here, when considering surface residues, even the most accurate domain-domain contact predictor, MIc, performs no better than random. Conclusions: All tested variants of MI are skewed towards predicting surface residues. When considering surface residues only, we find MIc to be the best current MI domain-domain contact predictor. Its performance, however, is not as good as that of a non-MI-based contact predictor, i-Patch. Additionally, the intra-protein contact prediction capabilities of MIc outperform its domain-domain contact prediction abilities. |
url |
http://www.biomedcentral.com/1756-0500/5/472 |
work_keys_str_mv |
AT gomesmireille mutualinformationandvariantsforproteindomaindomaincontactprediction AT hamerrebecca mutualinformationandvariantsforproteindomaindomaincontactprediction AT reinertgesine mutualinformationandvariantsforproteindomaindomaincontactprediction AT deanecharlottem mutualinformationandvariantsforproteindomaindomaincontactprediction |
_version_ |
1724907988966178816 |