Report other benchmark results relative to a baseline
Imagine the following scenario: I've built a new and fancy vector&lt;T&gt; and want to benchmark it. I could now write the benchmarks and get some numbers, but I would have absolutely no idea how good those numbers are. For simplicity, let's assume my benchmark tests push_back by pushing an integer 1M times, and it runs in 15ms. That's just an absolute number; it doesn't give me the slightest idea how this fares against std::vector&lt;T&gt; (for example).
Therefore I'd love to see an equivalent to BENCHMARK, called BENCHMARK_RELATIVE. I would recreate the same benchmark for std::vector&lt;T&gt; and use it as a baseline.
template &lt;class Vec&gt;
static void BM_PushBack(benchmark::State &amp;s)
{
    Vec v;
    v.reserve(1'000'000);
    for (auto _ : s) {
        v.clear();
        for (int i = 0; i &lt; 1'000'000; ++i)
            v.push_back(i);
    }
}

// A function template can't be aliased with `using`, so wrap the
// instantiations instead:
static void BM_PushBackStd(benchmark::State &amp;s)    { BM_PushBack&lt;std::vector&lt;int&gt;&gt;(s); }
static void BM_PushBackCustom(benchmark::State &amp;s) { BM_PushBack&lt;custom::vector&lt;int&gt;&gt;(s); }

BENCHMARK(BM_PushBackStd);
BENCHMARK_RELATIVE(BM_PushBackStd, BM_PushBackCustom); // proposed macro
The resulting output would look somewhat like this:
2018-08-22 10:46:25
Run on (8 X 4000 MHz CPU s)
CPU Caches:
L1 Data 32K (x4)
L1 Instruction 32K (x4)
L2 Unified 262K (x4)
L3 Unified 8388K (x1)
------------------------------------------------------------------
Benchmark            Relative     Time      CPU    Iterations
------------------------------------------------------------------
BM_PushBackStd                   30 ms    30 ms            41
BM_PushBackCustom     196.23%    15 ms    15 ms            22
What "relative" means is open to discussion. In this case I've defined 100% as equal in relative speed: everything below 100% is slower than the baseline, and everything above is faster. 196% means it's 1.96 times as fast as, or 96% faster than, the baseline.
And credit where credit is due, this idea originally comes from folly's benchmark.h
: https://github.com/facebook/folly/blob/master/folly/docs/Benchmark.md
It can already be done via tools/compare.py: tools/compare.py filters ./a.out BM_PushBackStd BM_PushBackCustom
Sweet. Am I missing something or can I not compare multiple benchmarks to a single baseline?
Am I missing something or can I not compare multiple benchmarks to a single baseline?
In a single go - correct.
To be honest I have a forked version that adds exactly that functionality (plus the possibility of comparing multiple benchmarks against a single baseline). @LebedevRI Would you suggest trying to make the pull request anyway?
I would say this should wait for 'proper' JSON support (https://github.com/google/benchmark/pull/499, v2 branch), and then be implemented on top of that, on the caller's side.
v2 is some way away, so I would say go for it with the multiple-benchmark comparison enhancement for the tooling.