Mutual information and variants for protein domain-domain contact prediction

Abstract. Background: Predicting protein contacts solely based on sequence information remains a challenging problem, despite the huge amount of sequence data at our disposal. Mutual Information (MI), an information theory measure, has been extensively em...


Bibliographic Details
Main Authors: Gomes Mireille, Hamer Rebecca, Reinert Gesine, Deane Charlotte M
Format: Article
Language: English
Published: BMC 2012-08-01
Series: BMC Research Notes
Online Access: http://www.biomedcentral.com/1756-0500/5/472
id doaj-c41abd65fe224cc78f1831b8a67d67c6
record_format Article
spelling doaj-c41abd65fe224cc78f1831b8a67d67c6 2020-11-25T02:12:50Z eng BMC BMC Research Notes 1756-0500 2012-08-01 5 1 472 10.1186/1756-0500-5-472 Mutual information and variants for protein domain-domain contact prediction Gomes Mireille Hamer Rebecca Reinert Gesine Deane Charlotte M http://www.biomedcentral.com/1756-0500/5/472
collection DOAJ
language English
format Article
sources DOAJ
author Gomes Mireille
Hamer Rebecca
Reinert Gesine
Deane Charlotte M
spellingShingle Gomes Mireille
Hamer Rebecca
Reinert Gesine
Deane Charlotte M
Mutual information and variants for protein domain-domain contact prediction
BMC Research Notes
author_facet Gomes Mireille
Hamer Rebecca
Reinert Gesine
Deane Charlotte M
author_sort Gomes Mireille
title Mutual information and variants for protein domain-domain contact prediction
title_short Mutual information and variants for protein domain-domain contact prediction
title_full Mutual information and variants for protein domain-domain contact prediction
title_fullStr Mutual information and variants for protein domain-domain contact prediction
title_full_unstemmed Mutual information and variants for protein domain-domain contact prediction
title_sort mutual information and variants for protein domain-domain contact prediction
publisher BMC
series BMC Research Notes
issn 1756-0500
publishDate 2012-08-01
description Abstract
Background: Predicting protein contacts solely based on sequence information remains a challenging problem, despite the huge amount of sequence data at our disposal. Mutual Information (MI), an information theory measure, has been extensively employed and modified to identify residues within a protein (intra-protein) that are in contact. More recently MI and its variants have also been used in the prediction of contacts between proteins (inter-protein).
Methods: Here we assess the predictive power of MI and variants for domain-domain contact prediction. We test original MI and these variants, which are called MIp, MIc and ZNMI, on 40 domain-domain test cases containing 10,753 sequences. We also propose and evaluate two new versions of MI that consider triangles of residues and the physiochemical properties of the amino acids, respectively.
Results: We found that all versions of MI are skewed towards predicting surface residues. Since domain-domain contacts are on the surface of each domain, we considered only surface residues when attempting to predict contacts. Our analysis shows that MIc is the best current MI domain-domain contact predictor. At 20% recall MIc achieved a precision of 44.9% when only surface residues were considered. Our triangle and reduced alphabet variants of MI highlight the delicate trade-off between signal and noise in the use of MI for domain-domain contact prediction. We also examine a specific “successful” case study and demonstrate that here, when considering surface residues, even the most accurate domain-domain contact predictor, MIc, performs no better than random.
Conclusions: All tested variants of MI are skewed towards predicting surface residues. When considering surface residues only, we find MIc to be the best current MI domain-domain contact predictor. Its performance, however, is not as good as a non-MI based contact predictor, i-Patch. Additionally, the intra-protein contact prediction capabilities of MIc outperform its domain-domain contact prediction abilities.
url http://www.biomedcentral.com/1756-0500/5/472