The GPU version of LASG/IAP Climate System Ocean Model version 3 (LICOM3) under the heterogeneous-compute interface for portability (HIP) framework and its large-scale application
Main Authors: | P. Wang, J. Jiang, P. Lin, M. Ding, J. Wei, F. Zhang, L. Zhao, Y. Li, Z. Yu, W. Zheng, Y. Yu, X. Chi, H. Liu |
---|---|
Format: | Article |
Language: | English |
Published: | Copernicus Publications, 2021-05-01 |
Series: | Geoscientific Model Development |
Online Access: | https://gmd.copernicus.org/articles/14/2781/2021/gmd-14-2781-2021.pdf |
id |
doaj-3046919fb99e46dda01dd9cdfc0b5d0d |
---|---|
record_format |
Article |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
P. Wang, J. Jiang, P. Lin, M. Ding, J. Wei, F. Zhang, L. Zhao, Y. Li, Z. Yu, W. Zheng, Y. Yu, X. Chi, H. Liu |
author_sort |
P. Wang |
title |
The GPU version of LASG/IAP Climate System Ocean Model version 3 (LICOM3) under the heterogeneous-compute interface for portability (HIP) framework and its large-scale application |
publisher |
Copernicus Publications |
series |
Geoscientific Model Development |
issn |
1991-959X, 1991-9603 |
publishDate |
2021-05-01 |
description |
A high-resolution (1/20°) global ocean general circulation model
with graphics processing unit (GPU) code implementations is developed based on
the LASG/IAP Climate System Ocean Model version 3 (LICOM3) under a
heterogeneous-compute interface for portability (HIP) framework. The dynamic
core and physics package of LICOM3 are both ported to the GPU, and
three-dimensional parallelization (also partitioned in the vertical direction) is
applied. The HIP version of LICOM3 (LICOM3-HIP) is 42 times faster than the
same number of CPU cores when 384 AMD GPUs and CPU cores are used. LICOM3-HIP
has excellent scalability; it can still obtain a speedup of more than 4 on
9216 GPUs compared to 384 GPUs. In this phase, we successfully
performed a test of 1/20° LICOM3-HIP using 6550 nodes and
26 200 GPUs, and on a large scale, the model's speed was increased to
approximately 2.72 simulated years per day (SYPD). By putting almost all the
computation processes inside GPUs, the time cost of data transfer between CPUs
and GPUs was reduced, resulting in high performance. Simultaneously, a 14-year
spin-up integration following phase 2 of the Ocean Model Intercomparison
Project (OMIP-2) protocol of surface forcing was performed, and preliminary
results were evaluated. We found that the model results had little difference
from the CPU version. Further comparison with observations and
lower-resolution LICOM3 results suggests that the 1/20° LICOM3-HIP
can reproduce the observations and produce many smaller-scale activities, such
as submesoscale eddies and frontal-scale structures. |
url |
https://gmd.copernicus.org/articles/14/2781/2021/gmd-14-2781-2021.pdf |
spelling |
Record ID: doaj-3046919fb99e46dda01dd9cdfc0b5d0d
Timestamp: 2021-05-18T10:42:22Z
Language: eng
Publisher: Copernicus Publications
Journal: Geoscientific Model Development
ISSN: 1991-959X, 1991-9603
Published: 2021-05-01
Volume: 14
Pages: 2781-2799
DOI: 10.5194/gmd-14-2781-2021
Affiliations:
1. State Key Laboratory of Numerical Modeling for Atmospheric Sciences and Geophysical Fluid Dynamics (LASG), Institute of Atmospheric Physics (IAP), Chinese Academy of Sciences (CAS), Beijing 100029, China
2. Center for Monsoon System Research (CMSR), Institute of Atmospheric Physics, Chinese Academy of Sciences, Beijing 100190, China
3. Computer Network Information Center, Chinese Academy of Sciences, Beijing 100190, China
4. University of Chinese Academy of Sciences, Beijing 100049, China
Authors (affiliation numbers): P. Wang (1, 2); J. Jiang (3, 4); P. Lin (1, 4); M. Ding (1); J. Wei (3); F. Zhang (3); L. Zhao (3); Y. Li (1); Z. Yu (1); W. Zheng (1, 4); Y. Yu (1, 4); X. Chi (3, 4); H. Liu (1, 4) |