The purpose of the Benchto project is to provide an easy and manageable way to define, run, and analyze macro benchmarks in a clustered environment. Understanding the behaviour of distributed systems is hard and requires good visibility into the state of the cluster and the internals of the tested system. This project was developed for repeatable benchmarking of Hadoop SQL engines, most importantly Trino.
Benchto consists of two main components: benchto-service and benchto-driver. To utilize all features of Benchto, it is also recommended to configure Graphite and Grafana. The image below depicts the high-level architecture:
- benchto-service - a persistent data store for benchmark results. It exposes a REST API and stores results in a relational database (PostgreSQL). The driver component calls this API to store benchmark execution details, which are later displayed by a webapp bundled with the service. A single instance of benchto-service can be shared by multiple benchmark drivers.
- benchto-driver - a standalone Java application which loads benchmark definitions (benchmark descriptors) and executes them against the tested distributed system. If cluster monitoring is in place, the driver collects various metrics (CPU, memory, network usage) describing cluster resource utilization. It also adds Graphite events which can later be displayed as Grafana annotations (see the sketch after this list). All data is stored in the service for later analysis.
- monitoring - cluster monitoring is optional, but highly recommended to fully understand the performance characteristics of the tested system. It is assumed that Graphite/Carbon is used as the metrics store and Grafana for cluster dashboards. There is no limitation on which metric agents are deployed on cluster hosts.
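The Grafana annotations mentioned above come from Graphite events. As a rough illustration of what posting such an event involves, the sketch below sends one to Graphite's standard /events/ endpoint; the host name, tags, and payload are placeholders, and this is not the driver's actual implementation.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Minimal sketch: post an event to Graphite so Grafana can render it as an annotation.
// The Graphite host and event fields are placeholders, not Benchto defaults.
// Note: Graphite 1.1+ expects "tags" as a JSON list; older versions use a space-separated string.
public class GraphiteEventExample {
    public static void main(String[] args) throws Exception {
        String json = """
                {"what": "benchmark started",
                 "tags": ["benchto", "example_benchmark"],
                 "data": "run 1 of 10"}""";

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://graphite.example.com/events/"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(json))
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println("Graphite responded with HTTP " + response.statusCode());
    }
}
```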
It is possible to profile every query execution with:
- Java Flight Recorder
- async profiler
- perf (Linux)
To use Java Flight Recorder, one should add the following configuration:
```yaml
benchmark:
  feature:
    profiler:
      profiled-coordinator: # pod name of coordinator
      enabled: true
      jfr:
        enabled: true
        output-path: /tmp # path where jfr recording files will be saved
        jmx.port: ${jmx.port} # JMX port of profiled JVM
```
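The jmx.port property identifies the JMX endpoint of the profiled JVM. As background, the snippet below is a minimal, hypothetical sketch of how a JFR recording can be controlled remotely through the standard FlightRecorderMXBean; it is not Benchto's internal code, and the host, port, and output path are placeholders.

```java
import javax.management.JMX;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;
import jdk.management.jfr.FlightRecorderMXBean;

// Minimal sketch: start and stop a JFR recording on a remote JVM over JMX.
// Host, port and output path are placeholders, not Benchto configuration values.
public class RemoteJfrExample {
    public static void main(String[] args) throws Exception {
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://coordinator-host:9999/jmxrmi");
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            FlightRecorderMXBean jfr = JMX.newMXBeanProxy(
                    connector.getMBeanServerConnection(),
                    new ObjectName("jdk.management.jfr:type=FlightRecorder"),
                    FlightRecorderMXBean.class);

            long recordingId = jfr.newRecording();
            jfr.startRecording(recordingId);

            // ... run the benchmark query while the recording is active ...
            Thread.sleep(5_000);

            jfr.stopRecording(recordingId);
            // Dump the recording to a file on the remote host.
            jfr.copyTo(recordingId, "/tmp/query.jfr");
            jfr.closeRecording(recordingId);
        }
    }
}
```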
To use async profiler, one should add the following configuration:
```yaml
benchmark:
  feature:
    profiler:
      profiled-coordinator: # pod name of coordinator
      enabled: true
      async:
        enabled: true
        output-path: /tmp # path where jfr recording files will be saved
        jmx.port: ${jmx.port} # JMX port of profiled JVM
        async-library-path: # path to libasyncProfiler shared library
        events: # list of async events like wall, cpu, lock, alloc and so on
          - cpu
```
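For context, async-profiler is loaded into the target JVM as a native agent, which is why the configuration needs the path to the libasyncProfiler shared library. The sketch below uses the standard JVM attach API to load that library into a running process and start CPU profiling; it is only an illustration with placeholder PID and paths, not how the driver necessarily wires this up.

```java
import com.sun.tools.attach.VirtualMachine;

// Minimal sketch: attach to a running JVM and load the async-profiler agent.
// The PID, library path and output file are placeholders.
public class AsyncProfilerAttachExample {
    public static void main(String[] args) throws Exception {
        String pid = "12345"; // PID of the JVM to profile
        VirtualMachine vm = VirtualMachine.attach(pid);
        try {
            // Agent options: start profiling CPU events and write output to a file.
            // The .jfr extension makes async-profiler emit JFR-format output.
            vm.loadAgentPath(
                    "/opt/async-profiler/lib/libasyncProfiler.so",
                    "start,event=cpu,file=/tmp/profile.jfr");
        } finally {
            vm.detach();
        }
    }
}
```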