Measuring the impact of burst buffers on data-intensive scientific workflows
- Univ. of Southern California, Los Angeles, CA (United States)
Science applications frequently produce and consume large volumes of data, but delivering this data to and from compute resources can be challenging, as parallel file system performance is not keeping up with compute and memory performance. To alleviate this I/O bottleneck, some systems have deployed burst buffers, but their impact on performance for real-world scientific workflow applications is still not clear. In this paper, we examine the impact of burst buffers through the remote-shared, allocatable burst buffers on the Cori system at NERSC. By running two data-intensive workflows (a high-throughput genome analysis workflow and a subset of the SCEC high-performance CyberShake workflow, a production seismic hazard analysis workflow), we find that using burst buffers offers read and write improvements of an order of magnitude. These improvements lead to increased job performance, and thereby increased overall workflow performance, even for long-running CPU-bound jobs.
- Research Organization:
- Univ. of Southern California, Los Angeles, CA (United States); Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States); Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States); Rensselaer Polytechnic Inst., Troy, NY (United States)
- Sponsoring Organization:
- USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR); National Science Foundation (NSF)
- Grant/Contract Number:
- SC0012636; AC02-05CH11231; EAR-1033462
- OSTI ID:
- 1603369
- Alternate ID(s):
- OSTI ID: 1568907
- Journal Information:
- Future Generation Computer Systems, Vol. 101, Issue C; Related Information: R. Ferreira da Silva, S. Callaghan, T. M. A. Do, G. Papadimitriou, and E. Deelman, “Measuring the Impact of Burst Buffers on Data-Intensive Scientific Workflows,” Future Generation Computer Systems, vol. 101, pp. 208–220, 2019; ISSN 0167-739X
- Publisher:
- Elsevier
- Country of Publication:
- United States
- Language:
- English
Similar Records
Performance Analysis Tool for HPC and Big Data Applications on Scientific Clusters
STAR Data Production Workflow on HPC: Lessons Learned & Best Practices