Link to original content: https://unpaywall.org/10.1007/978-3-031-48803-0_15

A Performance Modelling-Driven Approach to Hardware Resource Scaling

  • Conference paper
Euro-Par 2023: Parallel Processing Workshops (Euro-Par 2023)

Abstract

The continuous demand for higher computational performance, together with stagnating progress in general-purpose processors, has led to a surge of interest in highly specialized and efficient hardware. Combined with the rising popularity of parameterizable hardware, largely driven by the RISC-V Instruction Set Architecture (ISA), this creates a new opportunity to optimize such architectures for particular workloads. This work presents an application-specific optimization methodology for general-purpose processors, enabling the development of architectures that are faster and more efficient for their designated workloads. Driven by insights from the Cache-Aware Roofline Model (CARM), the methodology guides the configuration of the processor's memory and computational subsystems. We apply this methodology to two applications, demonstrating up to a \(2.67\times\) performance increase and a \(1.34\times\) improvement in energy efficiency.
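As a rough illustration of the kind of insight a roofline-style model provides, the sketch below computes the attainable performance at each memory level as the minimum of the peak compute throughput and the product of arithmetic intensity and that level's bandwidth. It is a minimal sketch of the general roofline bound, not the authors' methodology: the function names, the memory levels, and all numeric values are hypothetical and are not taken from the paper.

```python
# Minimal roofline-style sketch. All names and numbers are hypothetical
# illustrations; none of them come from the paper.

def attainable_gflops(ai, peak_gflops, bandwidth_gbs):
    """Roofline bound: min(peak compute, arithmetic intensity * bandwidth).

    ai            -- arithmetic intensity in FLOP/byte
    peak_gflops   -- peak compute throughput in GFLOP/s
    bandwidth_gbs -- sustained bandwidth of one memory level in GB/s
    """
    return min(peak_gflops, ai * bandwidth_gbs)


def roofs_per_level(ai, peak_gflops, level_bandwidths):
    """Evaluate the bound for every memory level at the same intensity."""
    return {
        level: attainable_gflops(ai, peak_gflops, bw)
        for level, bw in level_bandwidths.items()
    }


def is_memory_bound(ai, peak_gflops, bandwidth_gbs):
    """A kernel is memory-bound at a level if it sits left of the ridge point."""
    return ai < peak_gflops / bandwidth_gbs


# Hypothetical core: 16 GFLOP/s peak, three memory levels.
bandwidths = {"L1": 64.0, "L2": 32.0, "DRAM": 8.0}
roofs = roofs_per_level(0.25, 16.0, bandwidths)
for level, gflops in roofs.items():
    print(f"{level}: {gflops} GFLOP/s attainable")
```

Under these assumed numbers, a kernel with 0.25 FLOP/byte is limited to 2 GFLOP/s when served from DRAM but can reach the 16 GFLOP/s compute peak when served from L1. A gap of this kind is what would suggest scaling the memory subsystem rather than the compute units for such a workload.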

Acknowledgements

This project has received funding from the European High Performance Computing Joint Undertaking (JU) under Framework Partnership Agreement No 800928 and Specific Grant Agreement No 101036168 (EPI SGA2), Grant Agreement No 956213 (SparCity) and Grant Agreement No 101092877 (SYCLOPS). It also received funding from FCT (Fundação para a Ciência e a Tecnologia, Portugal), through the UIDB/50021/2020 project.

Author information

Correspondence to Alexandre Rodrigues.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Rodrigues, A., Sousa, L., Ilic, A. (2024). A Performance Modelling-Driven Approach to Hardware Resource Scaling. In: Zeinalipour, D., et al. Euro-Par 2023: Parallel Processing Workshops. Euro-Par 2023. Lecture Notes in Computer Science, vol 14352. Springer, Cham. https://doi.org/10.1007/978-3-031-48803-0_15

  • DOI: https://doi.org/10.1007/978-3-031-48803-0_15

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-48802-3

  • Online ISBN: 978-3-031-48803-0

  • eBook Packages: Computer Science, Computer Science (R0)
