Benchmarks: Aggregate results

Open lwwmanning opened this issue 8 months ago • 1 comments

For each suite, we should do something akin to ClickBench, which computes a shifted ratio for each query:

(new_value + 10ms) / (baseline_value + 10ms)

Notably, the baseline value in ClickBench is the fastest time for that single query across all engines; here we'd instead use the value from develop as baseline_value and the PR's value as new_value.

We can then aggregate each engine-suite pair (e.g., TPC-H on NVMe with DuckDB) the same way ClickBench does, by taking the geometric mean of those ratios.
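
To make the proposal concrete, here is a minimal Python sketch of the shifted ratio and its geometric mean for one engine-suite pair. The function names, the dict-of-query-times input shape, and the choice to skip queries missing from either run are assumptions for illustration, not the project's actual benchmark code.

```python
import math

SHIFT_MS = 10.0  # the 10ms shift from the formula above


def shifted_ratio(new_ms: float, baseline_ms: float, shift_ms: float = SHIFT_MS) -> float:
    """(new_value + 10ms) / (baseline_value + 10ms) for a single query."""
    return (new_ms + shift_ms) / (baseline_ms + shift_ms)


def geometric_mean(values: list[float]) -> float:
    """Geometric mean computed in log space to avoid overflow on long lists."""
    if not values:
        raise ValueError("need at least one value")
    return math.exp(sum(math.log(v) for v in values) / len(values))


def aggregate_pair(develop_ms: dict[str, float], pr_ms: dict[str, float]) -> float:
    """Score one engine-suite pair: geometric mean of per-query shifted ratios.

    Assumption: queries present in only one of the two runs are ignored.
    """
    common = sorted(develop_ms.keys() & pr_ms.keys())
    ratios = [shifted_ratio(pr_ms[q], develop_ms[q]) for q in common]
    return geometric_mean(ratios)


# Hypothetical timings in milliseconds, keyed by query name.
develop = {"q1": 10.0, "q2": 30.0}
pr = {"q1": 30.0, "q2": 10.0}
score = aggregate_pair(develop, pr)
```

A score above 1 means the PR is slower than develop on that pair, and below 1 means it is faster. The shift keeps very short queries, where timing noise dominates, from swinging the ratio.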

Apr 24 '25 14:04 lwwmanning

Alternatively (and much easier since GH does the diffing vs baseline), we could also just take the geometric mean of query runtimes per engine-suite pair.
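
A rough sketch of that simpler variant, assuming results arrive as (engine, suite, query, runtime_ms) tuples; that record shape and the function name are hypothetical, chosen only for illustration.

```python
from collections import defaultdict
from statistics import geometric_mean  # standard library, Python 3.8+


def geomean_by_pair(results):
    """Return {(engine, suite): geometric mean of query runtimes in ms}.

    `results` is an iterable of (engine, suite, query, runtime_ms) tuples;
    this record shape is an assumption for the example.
    """
    grouped = defaultdict(list)
    for engine, suite, _query, runtime_ms in results:
        grouped[(engine, suite)].append(runtime_ms)
    return {pair: geometric_mean(times) for pair, times in grouped.items()}


# Hypothetical runtimes for one engine-suite pair.
results = [
    ("duckdb", "tpch", "q1", 2.0),
    ("duckdb", "tpch", "q2", 8.0),
]
summary = geomean_by_pair(results)
```

Without the 10ms shift, the ratio of the PR geometric mean to the develop geometric mean equals the geometric mean of the per-query ratios, so diffing this single number against baseline gives the unshifted version of the ClickBench-style score.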

Apr 24 '25 14:04 lwwmanning

We have added that recently.

Aug 11 '25 10:08 robert3005