A Performance Modelling-Driven Approach to Hardware Resource Scaling

Rodrigues, Alexandre; Sousa, Leonel; Ilic, Aleksandar

doi:10.1007/978-3-031-48803-0_15

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14352)

Included in the following conference series:

European Conference on Parallel Processing

Abstract

The continuous demand for higher computational performance and the stagnating developments in the general purpose processor landscape have led to a surge of interest in highly specialized and efficient hardware. Combined with the rising popularity of parameterizable hardware, a new opportunity to optimize these architectures for particular workloads arises, largely driven by the RISC-V Instruction Set Architecture (ISA). This work presents an application-specific optimization methodology for general purpose processors, enabling the development of architectures which are faster and more efficient for their designated workloads. Driven by insights from the Cache-Aware Roofline Model (CARM), the methodology guides the configuration of the memory and computational subsystems of the processor. We apply this methodology to two applications, demonstrating up to a $2.67\times$ performance increase and a $1.34\times$ improvement in energy efficiency.
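
As a rough illustration of the kind of insight a roofline-style model gives, the sketch below evaluates the standard roofline bound, min(peak compute, bandwidth × arithmetic intensity), once per memory level. It is a minimal sketch and not the authors' methodology: the function names, the memory levels, and every numeric value are hypothetical and are not taken from the chapter.

```python
# Minimal roofline-style sketch. All numbers below are hypothetical
# placeholders, NOT measurements or parameters from the chapter.

PEAK_GFLOPS = 16.0  # assumed peak compute throughput (GFLOP/s)
BANDWIDTH_GBS = {"L1": 64.0, "L2": 32.0, "DRAM": 8.0}  # assumed GB/s per level


def carm_bound(ai, peak_gflops, bandwidth_gbs):
    """Upper bound on performance (GFLOP/s) at arithmetic intensity `ai` (FLOP/byte)."""
    return min(peak_gflops, bandwidth_gbs * ai)


def limiting_resource(ai, level):
    """Report whether memory bandwidth or compute caps performance at `ai`."""
    ridge = PEAK_GFLOPS / BANDWIDTH_GBS[level]  # AI where the two roofs meet
    return "memory" if ai < ridge else "compute"


if __name__ == "__main__":
    ai = 0.25  # hypothetical kernel: 0.25 FLOP per byte moved
    for level, bandwidth in BANDWIDTH_GBS.items():
        bound = carm_bound(ai, PEAK_GFLOPS, bandwidth)
        print(level, bound, limiting_resource(ai, level))
```

Under these assumed numbers, a kernel whose data comes from a level where it is memory-bound would gain more from a faster memory subsystem than from extra compute units, and the reverse holds when it is compute-bound.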

Acknowledgements

This project has received funding from the European High Performance Computing Joint Undertaking (JU) under Framework Partnership Agreement No 800928 and Specific Grant Agreement No 101036168 (EPI SGA2), Grant Agreement No 956213 (SparCity) and Grant Agreement No 101092877 (SYCLOPS). It also received funding from FCT (Fundação para a Ciência e a Tecnologia, Portugal), through the UIDB/50021/2020 project.

Author information

Authors and Affiliations

INESC-ID, Instituto Superior Tecnico, Universidade de Lisboa, Rua Alves Redol, 9, 1000-029, Lisboa, Portugal
Alexandre Rodrigues, Leonel Sousa & Aleksandar Ilic

Corresponding author

Correspondence to Alexandre Rodrigues.

Editor information

Editors and Affiliations

University of Cyprus, Nicosia, Cyprus
Demetris Zeinalipour
University of Santiago de Compostela, Santiago de Compostela, Spain
Dora Blanco Heras
University of Cyprus, Nicosia, Cyprus
George Pallis
Cyprus University of Technology, Limassol, Cyprus
Herodotos Herodotou
University of Nicosia, Nicosia, Cyprus
Demetris Trihinas
Inria, Nantes, France
Daniel Balouek
Louisiana State University, Baton Rouge, LA, USA
Patrick Diehl
Karlsruhe Institute of Technology, Karlsruhe, Germany
Terry Cojean
Ludwig-Maximilians-Universität, Munich, Germany
Karl Fürlinger
Roskilde University, Roskilde, Denmark
Maja Hanne Kirkeby
Bank of Italy, Rome, Italy
Matteo Nardelli
Roma Tre University, Rome, Italy
Pierangelo Di Sanzo

About this paper

Cite this paper

Rodrigues, A., Sousa, L., Ilic, A. (2024). A Performance Modelling-Driven Approach to Hardware Resource Scaling. In: Zeinalipour, D., et al. Euro-Par 2023: Parallel Processing Workshops. Euro-Par 2023. Lecture Notes in Computer Science, vol 14352. Springer, Cham. https://doi.org/10.1007/978-3-031-48803-0_15

DOI: https://doi.org/10.1007/978-3-031-48803-0_15
Published: 14 April 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-48802-3
Online ISBN: 978-3-031-48803-0
eBook Packages: Computer Science, Computer Science (R0)
