Abstract
In this work, we discuss the porting to the GPU platform of the latest production version of the Gyrokinetic Toroidal Code (GTC), a petascale fusion simulation code based on the particle-in-cell method. New GPU parallel algorithms have been designed for the particle push and shift operations. The GPU version of GTC was benchmarked on up to 3072 nodes of the Tianhe-1A supercomputer and shows an overall speedup of about 2x–3x when comparing NVIDIA M2050 GPUs to Intel Xeon X5670 CPUs. Strong and weak scaling studies have been performed using actual production simulation parameters, providing insights into GTC’s scalability and bottlenecks on large GPU supercomputers.
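The abstract does not say how the GPU shift algorithm works. As a rough illustration only, the sketch below shows one common data-parallel pattern for this kind of step: stream compaction driven by an exclusive prefix sum. It assumes the shift step separates particles that have left the local subdomain from those that stay. The function names, the `zeta` coordinate, and the interval bounds are hypothetical and are not taken from GTC.

```python
from itertools import accumulate


def exclusive_scan(flags):
    """Exclusive prefix sum: out[i] is the sum of flags[0..i-1]."""
    return [0] + list(accumulate(flags))[:-1] if flags else []


def shift_compact(zeta, zeta_min, zeta_max):
    """Split particles into those staying in [zeta_min, zeta_max) and those leaving.

    Hypothetical sketch: 'zeta' stands in for whatever coordinate decides
    which subdomain owns a particle.
    """
    # Flag each particle: 1 if it has left the local interval, else 0.
    leaving = [0 if zeta_min <= z < zeta_max else 1 for z in zeta]
    staying = [1 - f for f in leaving]

    # The scans give every particle a unique slot in its output buffer,
    # so the scatter below has no write conflicts.
    send_pos = exclusive_scan(leaving)
    keep_pos = exclusive_scan(staying)

    send_buf = [None] * sum(leaving)
    keep_buf = [None] * sum(staying)

    # On a GPU, each loop iteration would map to one thread.
    for i, z in enumerate(zeta):
        if leaving[i]:
            send_buf[send_pos[i]] = z
        else:
            keep_buf[keep_pos[i]] = z
    return keep_buf, send_buf


keep, send = shift_compact([0.1, 1.2, 0.5, -0.3, 0.9], 0.0, 1.0)
print(keep, send)
```

Because the scan fixes each particle's destination index up front, the scatter needs no atomics or locks, and the original particle order is preserved within each buffer. In a real code the send buffer would then be handed to the neighboring process.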
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Meng, X. et al. (2013). Heterogeneous Programming and Optimization of Gyrokinetic Toroidal Code and Large-Scale Performance Test on TH-1A. In: Kunkel, J.M., Ludwig, T., Meuer, H.W. (eds) Supercomputing. ISC 2013. Lecture Notes in Computer Science, vol 7905. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38750-0_7
DOI: https://doi.org/10.1007/978-3-642-38750-0_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-38749-4
Online ISBN: 978-3-642-38750-0
eBook Packages: Computer Science, Computer Science (R0)