The GPU version of LASG/IAP Climate System Ocean Model version 3 (LICOM3) under the heterogeneous-compute interface for portability (HIP) framework and its large-scale application


Bibliographic Details
Main Authors: P. Wang, J. Jiang, P. Lin, M. Ding, J. Wei, F. Zhang, L. Zhao, Y. Li, Z. Yu, W. Zheng, Y. Yu, X. Chi, H. Liu
Format: Article
Language: English
Published: Copernicus Publications 2021-05-01
Series: Geoscientific Model Development
Online Access: https://gmd.copernicus.org/articles/14/2781/2021/gmd-14-2781-2021.pdf
id doaj-3046919fb99e46dda01dd9cdfc0b5d0d
record_format Article
collection DOAJ
language English
format Article
sources DOAJ
author P. Wang
J. Jiang
P. Lin
M. Ding
J. Wei
F. Zhang
L. Zhao
Y. Li
Z. Yu
W. Zheng
Y. Yu
X. Chi
H. Liu
author_sort P. Wang
title The GPU version of LASG/IAP Climate System Ocean Model version 3 (LICOM3) under the heterogeneous-compute interface for portability (HIP) framework and its large-scale application
publisher Copernicus Publications
series Geoscientific Model Development
issn 1991-959X
1991-9603
publishDate 2021-05-01
description A high-resolution (1/20°) global ocean general circulation model with graphics processing unit (GPU) code implementations is developed based on the LASG/IAP Climate System Ocean Model version 3 (LICOM3) under a heterogeneous-compute interface for portability (HIP) framework. The dynamic core and physics package of LICOM3 are both ported to the GPU, and three-dimensional parallelization (the domain is also partitioned in the vertical direction) is applied. With 384 AMD GPUs, the HIP version of LICOM3 (LICOM3-HIP) is 42 times faster than the CPU version running on the same number (384) of CPU cores. LICOM3-HIP also scales well: on 9216 GPUs it still obtains a speedup of more than 4 relative to 384 GPUs.
In this phase, we successfully tested the 1/20° LICOM3-HIP on 6550 nodes with 26 200 GPUs; at this scale, the model speed reached approximately 2.72 simulated years per day (SYPD). Keeping almost all of the computation on the GPUs reduced the time cost of data transfer between CPUs and GPUs, which resulted in high performance. In addition, a 14-year spin-up integration following the surface forcing protocol of phase 2 of the Ocean Model Intercomparison Project (OMIP-2) was performed, and preliminary results were evaluated. The model results differ little from those of the CPU version.
Further comparison with observations and lower-resolution LICOM3 results suggests that the 1/20° LICOM3-HIP can reproduce the observations and produce many smaller-scale activities, such as submesoscale eddies and frontal-scale structures.
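The performance figures quoted in the description can be related to one another with a little arithmetic. The sketch below is illustrative only: the function and variable names are hypothetical and not taken from the article, the "more than 4" speedup is treated as a lower bound, and the spin-up estimate assumes (the description does not say so) that the 14-year integration ran at the 2.72 SYPD rate of the large-scale test.

```python
def strong_scaling_efficiency(base_gpus, scaled_gpus, speedup):
    """Parallel efficiency: achieved speedup divided by the ideal (linear) speedup."""
    ideal_speedup = scaled_gpus / base_gpus
    return speedup / ideal_speedup


def hours_per_simulated_year(sypd):
    """Wall-clock hours needed for one simulated year at a given SYPD."""
    return 24.0 / sypd


def wall_clock_days(simulated_years, sypd):
    """Wall-clock days needed to integrate a number of simulated years."""
    return simulated_years / sypd


# Figures quoted in the description.
gpus_per_node = 26200 / 6550

# "More than 4" on 9216 vs. 384 GPUs, so this is a lower bound on efficiency.
efficiency_floor = strong_scaling_efficiency(384, 9216, 4.0)

hours_per_year = hours_per_simulated_year(2.72)

# Assumption: the spin-up ran at the large-scale rate of 2.72 SYPD.
spinup_days = wall_clock_days(14, 2.72)

print(f"GPUs per node: {gpus_per_node:.0f}")
print(f"Strong-scaling efficiency (lower bound): {efficiency_floor:.1%}")
print(f"Wall-clock hours per simulated year: {hours_per_year:.2f}")
print(f"Wall-clock days for a 14-year spin-up: {spinup_days:.2f}")
```

Under these assumptions the test used 4 GPUs per node, the 24-fold increase in GPU count corresponds to a parallel efficiency of at least about 17 %, and one simulated year costs roughly 8.8 wall-clock hours.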
url https://gmd.copernicus.org/articles/14/2781/2021/gmd-14-2781-2021.pdf
_version_ 1721437470951735296
spelling doaj-3046919fb99e46dda01dd9cdfc0b5d0d; 2021-05-18T10:42:22Z; eng; Copernicus Publications; Geoscientific Model Development; ISSN 1991-959X, 1991-9603; 2021-05-01; vol. 14, pp. 2781-2799; doi:10.5194/gmd-14-2781-2021
affiliations 1 State Key Laboratory of Numerical Modeling for Atmospheric Sciences and Geophysical Fluid Dynamics (LASG), Institute of Atmospheric Physics (IAP), Chinese Academy of Sciences (CAS), Beijing 100029, China
2 Center for Monsoon System Research (CMSR), Institute of Atmospheric Physics, Chinese Academy of Sciences, Beijing 100190, China
3 Computer Network Information Center, Chinese Academy of Sciences, Beijing 100190, China
4 University of Chinese Academy of Sciences, Beijing 100049, China
author_affiliations P. Wang (1, 2); J. Jiang (3, 4); P. Lin (1, 4); M. Ding (1); J. Wei (3); F. Zhang (3); L. Zhao (3); Y. Li (1); Z. Yu (1); W. Zheng (1, 4); Y. Yu (1, 4); X. Chi (3, 4); H. Liu (1, 4)