Abstract
Graphics processing units (GPUs) are increasingly used for general-purpose computing. We present implementations of large-integer modular exponentiation, the core of public-key cryptosystems such as RSA, on a DirectX 10-compliant GPU. DirectX 10-compliant graphics processors are the latest generation of GPU architecture, which provides increased programming flexibility and support for integer operations. We present high-performance modular exponentiation implementations based on integers represented in both standard radix form and residue number system (RNS) form. We show that a GPU implementation of a 1024-bit RSA decrypt primitive can outperform a comparable CPU implementation by up to 4 times, and can improve on previous GPU implementations by reducing latency by up to 7 times and doubling throughput. We show that an adaptive approach to modular exponentiation, combining implementations based on both a radix and a residue number system, gives the best all-around performance on the GPU in terms of both latency and throughput. We also highlight the usage criteria necessary for the GPU to reach peak performance on public-key cryptographic operations.
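To make the two integer representations named in the abstract concrete, the following is a minimal CPU-side sketch in Python, not the GPU implementation described in the paper. It shows a plain square-and-multiply modular exponentiation on ordinary (radix) integers, and a toy residue number system in which a product is computed independently in each residue channel and then recovered with the Chinese Remainder Theorem. The function names and the small moduli are assumptions chosen for illustration; a real RSA implementation would use far larger operands and a reduction method suited to the hardware.

```python
from math import prod


def modexp(base, exponent, modulus):
    """Left-to-right binary (square-and-multiply) modular exponentiation."""
    result = 1
    base %= modulus
    for bit in bin(exponent)[2:]:          # scan exponent bits, most significant first
        result = (result * result) % modulus
        if bit == "1":
            result = (result * base) % modulus
    return result


def to_rns(x, moduli):
    """Represent x by its residues modulo each (pairwise coprime) modulus."""
    return [x % m for m in moduli]


def rns_mul(a, b, moduli):
    """Multiply two RNS values; each channel is independent of the others."""
    return [(x * y) % m for x, y, m in zip(a, b, moduli)]


def from_rns(residues, moduli):
    """Recover the integer in [0, product of moduli) via the CRT."""
    M = prod(moduli)
    total = 0
    for r, m in zip(residues, moduli):
        Mi = M // m
        total += r * Mi * pow(Mi, -1, m)   # pow(..., -1, m) is the inverse mod m
    return total % M


# Illustrative, hypothetical base: pairwise coprime moduli that fit in a byte or so.
MODULI = [251, 253, 255, 256]

a, b = 123456, 7890
product = from_rns(rns_mul(to_rns(a, MODULI), to_rns(b, MODULI), MODULI), MODULI)
```

The channel-wise multiplication in `rns_mul` has no carries between channels, which is what makes an RNS attractive for parallel hardware; the result is only meaningful while the true product stays below the product of the moduli, as it does for the values of `a` and `b` used here.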
© 2009 Springer-Verlag Berlin Heidelberg
Cite this paper
Harrison, O., Waldron, J. (2009). Efficient Acceleration of Asymmetric Cryptography on Graphics Hardware. In: Preneel, B. (eds) Progress in Cryptology – AFRICACRYPT 2009. AFRICACRYPT 2009. Lecture Notes in Computer Science, vol 5580. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02384-2_22
DOI: https://doi.org/10.1007/978-3-642-02384-2_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-02383-5
Online ISBN: 978-3-642-02384-2
eBook Packages: Computer Science (R0)