CNNcon: improved protein contact maps prediction using cascaded neural networks.

BACKGROUND: Despite continuing progress in X-ray crystallography and high-field NMR spectroscopy for determination of three-dimensional protein structures, the number of unsolved and newly discovered sequences grows much faster than that of determined structures. Protein modeling methods can possib...


Bibliographic Details
Main Authors: Wang Ding, Jiang Xie, Dongbo Dai, Huiran Zhang, Hao Xie, Wu Zhang
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2013-01-01
Series: PLoS ONE
Online Access: http://europepmc.org/articles/PMC3634008?pdf=render
id doaj-09072d06cfce4a8d98c58d8ae647f0ce
record_format Article
spelling doaj-09072d06cfce4a8d98c58d8ae647f0ce 2020-11-24T21:57:28Z eng Public Library of Science (PLoS) PLoS ONE 1932-6203 2013-01-01 8(4): e61533 doi:10.1371/journal.pone.0061533
collection DOAJ
language English
format Article
sources DOAJ
author Wang Ding
Jiang Xie
Dongbo Dai
Huiran Zhang
Hao Xie
Wu Zhang
title CNNcon: improved protein contact maps prediction using cascaded neural networks.
publisher Public Library of Science (PLoS)
series PLoS ONE
issn 1932-6203
publishDate 2013-01-01
description BACKGROUND: Despite continuing progress in X-ray crystallography and high-field NMR spectroscopy for determining three-dimensional protein structures, the number of unsolved and newly discovered sequences grows much faster than the number of determined structures. With advances in computational science, protein modeling methods may bridge this large sequence-structure gap. A grand challenge is to predict a protein's three-dimensional structure from its primary structure (residue sequence) alone. Predicting residue contact maps is a crucial and promising intermediate step towards that final three-dimensional structure prediction. Better predictions of local and non-local contacts between residues can transform protein sequence alignment into structure alignment, which in turn can greatly improve template-based three-dimensional protein structure predictors. METHODS: This paper presents CNNcon, an improved contact map predictor based on multiple neural networks: six sub-networks and one final cascade network. Both the sub-networks and the final cascade network were trained and tested with their corresponding data sets. For testing, the target protein was first encoded and then input to its corresponding sub-networks for prediction. The intermediate results were then input to the cascade network to produce the final prediction. RESULTS: CNNcon accurately predicts, on average, 58.86% of contacts at a distance cutoff of 8 Å for proteins with lengths ranging from 51 to 450. The comparison shows that the method performs better than the state-of-the-art predictors it was compared with. In particular, prediction accuracy remains steady as protein sequence length increases, which indicates that CNNcon overcomes the thin density problem that troubles other current predictors. This advantage makes the method valuable for predicting long proteins, for which effective prediction may thus become possible.
url http://europepmc.org/articles/PMC3634008?pdf=render