Abstract
We present a performance analysis of OpenCL kernels on three recently introduced many-core accelerator architectures: the Intel Xeon Phi coprocessor and the NVIDIA Kepler and Fermi GPUs. As a case study we use finite element numerical integration, a practically important and theoretically interesting algorithm in scientific computing. We design a single parametrized kernel for all three architectures and measure its performance in numerical tests. We indicate possible further architecture-dependent optimizations and draw conclusions on performance portability for different accelerator architectures and the OpenCL programming model.
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Banaś, K., Krużel, F. (2014). OpenCL Performance Portability for Xeon Phi Coprocessor and NVIDIA GPUs: A Case Study of Finite Element Numerical Integration. In: Lopes, L., et al. Euro-Par 2014: Parallel Processing Workshops. Euro-Par 2014. Lecture Notes in Computer Science, vol 8806. Springer, Cham. https://doi.org/10.1007/978-3-319-14313-2_14
DOI: https://doi.org/10.1007/978-3-319-14313-2_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-14312-5
Online ISBN: 978-3-319-14313-2
eBook Packages: Computer Science, Computer Science (R0)