Hardware/Software Co-design of Global Cloud System Resolving

Bibliographic Details
Main Authors: Michael Wehner, Marghoob Mohiyuddin, David Randall, Woo-Sun Yang, Huiro Miura, Norman Miller, Ross Heikes, Leonid Oliker, John Shalf, Shoaib Kamil, Celal Konor, David Donofrio, Leroy A. Drummond
Format: Article
Language: English
Published: American Geophysical Union (AGU) 2011-10-01
Series: Journal of Advances in Modeling Earth Systems
Online Access: http://james.agu.org/index.php/JAMES/article/view/v3n12
Description
Summary: We present an analysis of the performance aspects of an atmospheric general circulation model at the ultra-high resolution required to resolve individual cloud systems and describe alternative technological paths to realize the integration of such a model in the relatively near future. Due to a superlinear scaling of the computational burden dictated by the Courant stability criterion, the solution of the equations of motion dominates the calculation at these ultra-high resolutions. From this extrapolation, it is estimated that a credible kilometer-scale atmospheric model would require a sustained computational rate of at least 28 Petaflop/s to provide scientifically useful climate simulations. Our design study portends an alternate strategy for practical power-efficient implementations of next-generation ultra-scale systems. We demonstrate that hardware/software co-design of low-power embedded processor technology could be exploited to design a custom machine tailored to ultra-high resolution climate model specifications at relatively affordable cost and power. A strawman machine design is presented consisting of more than 20 million processing elements that effectively exploits forthcoming many-core chips. The system pushes the limits of domain decomposition to increase explicit parallelism and suggests that functional partitioning of sub-components of the climate code (much like the coarse-grained partitioning of computation between the atmospheric, ocean, land, and ice components of current coupled models) may be necessary for future performance scaling.
ISSN: 1942-2466
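
Note: The summary's two quantitative claims can be illustrated with a rough back-of-the-envelope sketch. The code below is not taken from the article. It assumes, as a simplification, that the number of vertical levels is fixed and that the time step shrinks in proportion to the horizontal grid spacing (the usual reading of the Courant criterion), which gives a cubic cost growth; the summary itself says only "superlinear". The function names and the 100 km to 1 km example are hypothetical. The second function simply divides the quoted 28 Petaflop/s by the quoted 20 million processing elements, both of which the summary states as lower bounds, so the result is an order-of-magnitude figure only.

```python
def cfl_cost_factor(dx_coarse_km: float, dx_fine_km: float) -> float:
    """Relative compute cost of refining the horizontal grid spacing.

    Assumption (not from the article): fixed vertical levels, and a time
    step proportional to grid spacing, as a Courant-type limit implies.
    """
    r = dx_coarse_km / dx_fine_km  # refinement ratio
    columns = r ** 2               # horizontal grid columns grow as r^2
    steps = r                      # smaller time step means r times more steps
    return columns * steps         # total work grows as r^3


def per_element_rate_gflops(total_pflops: float, n_elements: float) -> float:
    """Sustained rate each processing element would need, in Gigaflop/s."""
    return total_pflops * 1e15 / n_elements / 1e9


if __name__ == "__main__":
    # Hypothetical example: going from 100 km to 1 km grid spacing.
    print(cfl_cost_factor(100.0, 1.0))
    # Figures quoted in the summary: 28 Petaflop/s, 20 million elements.
    print(per_element_rate_gflops(28.0, 20e6))
```

Under these assumptions, a hundredfold refinement costs about a million times more work, and each of the 20 million elements would need to sustain roughly 1.4 Gigaflop/s, a rate consistent with the summary's emphasis on low-power embedded processors.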