Multi-criteria protein structure comparison and structural similarities analysis using pyMCPSC.
Protein Structure Comparison (PSC) is a well-developed field of computational proteomics with active interest from the research community, since it is widely used in structural biology and drug discovery. With new PSC methods continuously emerging and no clear method of choice, Multi-Criteria Protei...
Main Authors: | Anuj Sharma, Elias S Manolakos |
---|---|
Format: | Article |
Language: | English |
Published: | Public Library of Science (PLoS), 2018-01-01 |
Series: | PLoS ONE |
Online Access: | http://europepmc.org/articles/PMC6192565?pdf=render |
id | doaj-cf52b690df08409fb71e6762f004bfc6 |
---|---|
record_format | Article |
spelling | doaj-cf52b690df08409fb71e6762f004bfc6 2020-11-25T00:08:49Z eng Public Library of Science (PLoS) PLoS ONE 1932-6203 2018-01-01 13 10 e0204587 10.1371/journal.pone.0204587 Multi-criteria protein structure comparison and structural similarities analysis using pyMCPSC. Anuj Sharma Elias S Manolakos Protein Structure Comparison (PSC) is a well-developed field of computational proteomics with active interest from the research community, since it is widely used in structural biology and drug discovery. With new PSC methods continuously emerging and no clear method of choice, Multi-Criteria Protein Structure Comparison (MCPSC) is commonly employed to combine methods and generate consensus structural similarity scores. We present pyMCPSC, a Python-based utility we developed to allow users to perform MCPSC efficiently by exploiting the parallelism afforded by the multi-core CPUs of today's desktop computers. We show how pyMCPSC facilitates the analysis of similarities in protein domain datasets and how it can be extended to incorporate new PSC methods as they become available. We exemplify the power of pyMCPSC using a case study based on the Proteus_300 dataset. Results generated using pyMCPSC show that MCPSC scores form a reliable basis for identifying the true classification of a domain, as evidenced by both the ROC analysis and the Nearest-Neighbor analysis. The structure-similarity-based "Phylogenetic Tree" representations generated by pyMCPSC provide insight into functional grouping within the dataset of domains. Furthermore, scatter plots generated by pyMCPSC show the existence of strong correlation between protein domains belonging to SCOP Class C and loose correlation between those of SCOP Class D. Such analyses and corresponding visualizations help users quickly gain insights about their datasets. The source code of pyMCPSC is available under the GPLv3.0 license through a GitHub repository (https://github.com/xulesc/pymcpsc). http://europepmc.org/articles/PMC6192565?pdf=render |
collection | DOAJ |
language | English |
format | Article |
sources | DOAJ |
author | Anuj Sharma Elias S Manolakos |
spellingShingle | Anuj Sharma Elias S Manolakos Multi-criteria protein structure comparison and structural similarities analysis using pyMCPSC. PLoS ONE |
author_facet | Anuj Sharma Elias S Manolakos |
author_sort | Anuj Sharma |
title | Multi-criteria protein structure comparison and structural similarities analysis using pyMCPSC. |
title_short | Multi-criteria protein structure comparison and structural similarities analysis using pyMCPSC. |
title_full | Multi-criteria protein structure comparison and structural similarities analysis using pyMCPSC. |
title_fullStr | Multi-criteria protein structure comparison and structural similarities analysis using pyMCPSC. |
title_full_unstemmed | Multi-criteria protein structure comparison and structural similarities analysis using pyMCPSC. |
title_sort | multi-criteria protein structure comparison and structural similarities analysis using pymcpsc. |
publisher | Public Library of Science (PLoS) |
series | PLoS ONE |
issn | 1932-6203 |
publishDate | 2018-01-01 |
description | Protein Structure Comparison (PSC) is a well-developed field of computational proteomics with active interest from the research community, since it is widely used in structural biology and drug discovery. With new PSC methods continuously emerging and no clear method of choice, Multi-Criteria Protein Structure Comparison (MCPSC) is commonly employed to combine methods and generate consensus structural similarity scores. We present pyMCPSC, a Python-based utility we developed to allow users to perform MCPSC efficiently by exploiting the parallelism afforded by the multi-core CPUs of today's desktop computers. We show how pyMCPSC facilitates the analysis of similarities in protein domain datasets and how it can be extended to incorporate new PSC methods as they become available. We exemplify the power of pyMCPSC using a case study based on the Proteus_300 dataset. Results generated using pyMCPSC show that MCPSC scores form a reliable basis for identifying the true classification of a domain, as evidenced by both the ROC analysis and the Nearest-Neighbor analysis. The structure-similarity-based "Phylogenetic Tree" representations generated by pyMCPSC provide insight into functional grouping within the dataset of domains. Furthermore, scatter plots generated by pyMCPSC show the existence of strong correlation between protein domains belonging to SCOP Class C and loose correlation between those of SCOP Class D. Such analyses and corresponding visualizations help users quickly gain insights about their datasets. The source code of pyMCPSC is available under the GPLv3.0 license through a GitHub repository (https://github.com/xulesc/pymcpsc). |
url | http://europepmc.org/articles/PMC6192565?pdf=render |
work_keys_str_mv | AT anujsharma multicriteriaproteinstructurecomparisonandstructuralsimilaritiesanalysisusingpymcpsc AT eliassmanolakos multicriteriaproteinstructurecomparisonandstructuralsimilaritiesanalysisusingpymcpsc |
_version_ | 1725414437572050944 |
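
The abstract describes combining several PSC methods into consensus similarity scores and using them for Nearest-Neighbor analysis. The following Python sketch illustrates one simple way such a consensus could be formed. It is not pyMCPSC's API or the authors' scoring scheme: the function names, the min-max normalization, the plain averaging, the method names (`method_a`, `method_b`), the domain labels, and the score values are all hypothetical, and scores are assumed to be dissimilarities (lower means more similar).

```python
from statistics import mean


def min_max_normalize(scores):
    """Rescale a {pair: score} mapping to the [0, 1] range."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores are identical, so no pair is distinguishable.
        return {pair: 0.0 for pair in scores}
    return {pair: (s - lo) / (hi - lo) for pair, s in scores.items()}


def consensus_scores(per_method):
    """Average the normalized scores over the methods that scored each pair."""
    normalized = [min_max_normalize(s) for s in per_method.values()]
    pairs = set().union(*normalized)
    return {
        pair: mean(n[pair] for n in normalized if pair in n)
        for pair in pairs
    }


def nearest_neighbor(query, consensus):
    """Return the domain with the smallest consensus dissimilarity to query."""
    candidates = {}
    for (a, b), score in consensus.items():
        if a == query:
            candidates[b] = score
        elif b == query:
            candidates[a] = score
    return min(candidates, key=candidates.get)


# Hypothetical dissimilarity scores from two made-up PSC methods.
per_method = {
    "method_a": {("d1", "d2"): 1.0, ("d1", "d3"): 5.0, ("d2", "d3"): 3.0},
    "method_b": {("d1", "d2"): 0.2, ("d1", "d3"): 0.8, ("d2", "d3"): 0.5},
}

consensus = consensus_scores(per_method)
print(nearest_neighbor("d1", consensus))
```

Normalizing each method before averaging keeps a method with a wide score range from dominating the consensus. Other combination schemes, such as weighted averages, would fit the same structure.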