Spark-based parallel calculation of 3D Fourier shell correlation for macromolecule structure local resolution estimation

Abstract Background Resolution estimation is the main evaluation criterion for the reconstruction of macromolecular 3D structures in the field of cryoelectron microscopy (cryo-EM). At present, there are many methods to evaluate the 3D resolution of macromolecular structures reconstructed from Single...

Bibliographic Details
Main Authors: Yongchun Lü, Xiangrui Zeng, Xinhui Tian, Xiao Shi, Hui Wang, Xiaohui Zheng, Xiaodong Liu, Xiaofang Zhao, Xin Gao, Min Xu
Format: Article
Language: English
Published: BMC 2020-09-01
Series: BMC Bioinformatics
Online Access: http://link.springer.com/article/10.1186/s12859-020-03680-6
id doaj-40ec31910d3842618ff41006438e9502
record_format Article
spelling doaj-40ec31910d3842618ff41006438e9502
2020-11-25T03:13:31Z
BMC Bioinformatics, ISSN 1471-2105, volume 21, issue S13, pages 1-18, 2020-09-01
doi 10.1186/s12859-020-03680-6
affiliations Yongchun Lü: Institute of Computing Technology of the Chinese Academy of Sciences
Xiangrui Zeng: Computational Biology Department, School of Computer Science, Carnegie Mellon University
Xinhui Tian: Institute of Computing Technology of the Chinese Academy of Sciences
Xiao Shi: Institute of Computing Technology of the Chinese Academy of Sciences
Hui Wang: Institute of Computing Technology of the Chinese Academy of Sciences
Xiaohui Zheng: Institute of Computing Technology of the Chinese Academy of Sciences
Xiaodong Liu: Institute of Computing Technology of the Chinese Academy of Sciences
Xiaofang Zhao: Institute of Computing Technology of the Chinese Academy of Sciences
Xin Gao: King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division
Min Xu: Computational Biology Department, School of Computer Science, Carnegie Mellon University
collection DOAJ
language English
format Article
sources DOAJ
publisher BMC
series BMC Bioinformatics
issn 1471-2105
publishDate 2020-09-01
description Abstract
Background: Resolution estimation is the main evaluation criterion for the reconstruction of macromolecular 3D structures in the field of cryoelectron microscopy (cryo-EM). At present, there are many methods for evaluating the 3D resolution of macromolecular structures reconstructed by Single Particle Analysis (SPA) in cryo-EM and by subtomogram averaging (SA) in electron cryotomography (cryo-ET). As global methods, they measure the resolution of the structure as a whole, but they are inaccurate at detecting subtle local changes in a reconstruction. To detect such subtle changes in SPA and SA reconstructions, a few local resolution methods have been proposed. The mainstream local resolution evaluation methods are based on local Fourier shell correlation (FSC), which is computationally intensive. However, the existing implementations rely on multi-threading on a single computer and scale very poorly.
Results: This paper proposes a new fine-grained 3D array partition method that uses a key-value format in Spark. The method first converts 3D images to key-value (K-V) data. The K-V data is then used for 3D array partitioning and data exchange in parallel, so a Spark-based distributed parallel computing framework can address the scalability problem described above. In this framework, all 3D local FSC tasks are calculated simultaneously across multiple nodes in a computer cluster. On experimental data, the 3D local resolution evaluation algorithm based on Spark fine-grained 3D array partitioning shows a change of magnitude in computing speed compared with the mainstream FSC algorithm while the accuracy remains unchanged, and it has better fault tolerance and scalability.
Conclusions: In this paper, we proposed a K-V format based fine-grained 3D array partition method in Spark for calculating 3D FSC in parallel to obtain a 3D local resolution density map. The 3D local resolution density map evaluates the three-dimensional density maps reconstructed from single particle analysis and subtomogram averaging. Our proposed method can significantly increase the speed of 3D local resolution evaluation, which is important for the efficient detection of subtle variations among reconstructed macromolecular structures.
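The K-V conversion and partitioning steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation and it does not use Spark: plain Python stands in for a map step followed by a group-by-key step, and the function names `volume_to_kv` and `partition_by_block`, the `(z, y, x)` key layout, and the cubic block size are all assumptions made for illustration.

```python
from collections import defaultdict


def volume_to_kv(volume):
    """Flatten a nested-list 3D volume into ((z, y, x), value) pairs."""
    return [
        ((z, y, x), value)
        for z, plane in enumerate(volume)
        for y, row in enumerate(plane)
        for x, value in enumerate(row)
    ]


def partition_by_block(kv_pairs, block):
    """Group voxel pairs by the cubic block that contains them.

    Hypothetical stand-in for mapping each voxel to (block_key, voxel)
    and then grouping by key; the block key layout is an assumption.
    """
    blocks = defaultdict(list)
    for (z, y, x), value in kv_pairs:
        block_key = (z // block, y // block, x // block)
        blocks[block_key].append(((z, y, x), value))
    return dict(blocks)


# A 4x4x4 toy volume whose voxel value encodes its own position.
size = 4
volume = [
    [[100 * z + 10 * y + x for x in range(size)] for y in range(size)]
    for z in range(size)
]

kv = volume_to_kv(volume)
blocks = partition_by_block(kv, block=2)
```

Each block is keyed independently of the others, so in a distributed setting a per-block task such as a local FSC calculation could in principle be scheduled on any node that holds that key.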
topic 3D local Fourier shell correlation
3D local resolution map
Key-value data
Spark
3D array partition
url http://link.springer.com/article/10.1186/s12859-020-03680-6