Simultaneous Alignment and Folding of Protein Sequences

Bibliographic Details
Main Authors: O'Donnell, Charles William (Contributor), Will, Sebastian (Contributor), Devadas, Srinivas (Contributor), Backofen, Rolf (Author), Berger, Bonnie (Contributor), Waldispuhl, Jerome (Contributor)
Other Authors: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory (Contributor), Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science (Contributor), Massachusetts Institute of Technology. Department of Mathematics (Contributor), Massachusetts Institute of Technology. Research Laboratory of Electronics (Contributor)
Format: Article
Language: English
Published: Mary Ann Liebert, 2015-11-23.
Description
Summary: Building accurate comparative analysis tools for low-homology proteins remains a difficult challenge in computational biology, especially for sequence alignment and consensus folding problems. We present partiFold-Align, the first algorithm for simultaneous alignment and consensus folding of unaligned protein sequences; the algorithm's complexity is polynomial in time and space. Algorithmically, partiFold-Align exploits sparsity in the set of super-secondary structure pairings and alignment candidates to achieve an effectively cubic running time for simultaneous pairwise alignment and folding. We demonstrate the efficacy of these techniques on transmembrane β-barrel proteins, an important yet difficult class of proteins with few known three-dimensional structures. Tested against structurally derived sequence alignments, partiFold-Align significantly outperforms state-of-the-art pairwise and multiple sequence alignment tools in the most difficult low-sequence-homology case. It also improves secondary structure prediction where current approaches fail. Importantly, partiFold-Align requires no prior training. These general techniques are widely applicable to many more protein families. partiFold-Align is available at http://partifold.csail.mit.edu/.
National Institutes of Health (U.S.) (Grant R01GM081871)