uarch-bench
Implement "delta" measurement
Currently we just measure the absolute time of the code under test like so:
static int64_t time_method(size_t loop_count) {
    auto t0 = CLOCK::now();
    METHOD(loop_count);
    auto t1 = CLOCK::now();
    return t1 - t0;
}
The downside of this approach is that it includes the time for one CLOCK::now() call as well as all the overhead of METHOD(loop_count), which includes at least a call and ret, and sometimes a small amount of setup overhead.
A better approach is to time the loop with two different loop_count values and use the difference in time to calculate the performance. This causes the above overheads to cancel out. The test/jump overhead inside the benchmark's loop is still present, but it is small or sometimes zero.
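A minimal sketch of the loop-count delta idea follows. It is an illustration under assumptions, not the project's implementation: StdClock stands in for CLOCK and is assumed to return integer nanoseconds, spin is a hypothetical method under test, and the helper names per_iteration_delta and time_per_iteration are made up for this example.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>

// Assumed stand-in for the project's CLOCK: now() returns integer nanoseconds.
struct StdClock {
    static int64_t now() {
        using namespace std::chrono;
        return duration_cast<nanoseconds>(
            steady_clock::now().time_since_epoch()).count();
    }
};

// Hypothetical method under test; volatile keeps the loop from being removed.
void spin(size_t loop_count) {
    volatile size_t sink = 0;
    for (size_t i = 0; i < loop_count; i++) {
        sink = sink + i;
    }
}

using bench_fn = void (*)(size_t);

// Absolute timing, as in the issue: includes clock and call/ret overhead.
int64_t time_method(bench_fn method, size_t loop_count) {
    int64_t t0 = StdClock::now();
    method(loop_count);
    int64_t t1 = StdClock::now();
    return t1 - t0;
}

// Fixed overhead appears in both timings, so it drops out of the difference.
double per_iteration_delta(int64_t t_small, int64_t t_large,
                           size_t n_small, size_t n_large) {
    assert(n_large > n_small);
    return static_cast<double>(t_large - t_small) /
           static_cast<double>(n_large - n_small);
}

// Time the same method at two loop counts and return the per-iteration cost.
double time_per_iteration(bench_fn method, size_t n_small, size_t n_large) {
    int64_t t_small = time_method(method, n_small);
    int64_t t_large = time_method(method, n_large);
    return per_iteration_delta(t_small, t_large, n_small, n_large);
}
```

For example, if each run carries a fixed overhead of 100 ns and each iteration costs 3 ns, the two timings at 1000 and 2000 iterations are 3100 ns and 6100 ns, and per_iteration_delta(3100, 6100, 1000, 2000) recovers 3 ns per iteration with the overhead gone. Real measurements are noisy, so a practical version would likely repeat each timing and take a minimum or median.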
Note that two-method delta measurement has been implemented in 7667eacd333d4dcefaa2530f4e1228227c01dfef. This shows the results as a delta between the benchmark method and a "base" method that defaults to the empty benchmark dummy_bench.
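The two-method delta can be pictured roughly as below. This is a sketch under assumptions rather than the code from that commit: only the name dummy_bench comes from the issue, while bench_fn, timer_fn, work and time_delta_vs_base are hypothetical. The timer is passed in as a parameter so that the subtraction itself is easy to see and to test.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

using bench_fn = void (*)(size_t);
// Hypothetical timer signature: returns the absolute time for one call.
using timer_fn = int64_t (*)(bench_fn, size_t);

// Empty base benchmark, playing the role the issue gives dummy_bench.
void dummy_bench(size_t) {}

// Hypothetical benchmark method used only for illustration.
void work(size_t loop_count) {
    volatile size_t sink = 0;
    for (size_t i = 0; i < loop_count; i++) {
        sink = sink + i;
    }
}

// Report the method's time minus the base method's time at the same loop
// count. Overhead shared by both (clock call, call/ret) cancels; setup that
// only the real method performs does not.
int64_t time_delta_vs_base(timer_fn timer, bench_fn method, size_t loop_count,
                           bench_fn base = dummy_bench) {
    assert(timer != nullptr && method != nullptr && base != nullptr);
    return timer(method, loop_count) - timer(base, loop_count);
}
```

Any function with the timer_fn shape can be supplied, for instance a wrapper around the absolute measurement shown at the top of this issue that takes the method as an argument.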
Leaving this open since we should still implement loop-count-based deltas as described above: this uses the same method twice, but with different loop counts. This has some advantages and also some disadvantages over the base method. The primary advantage is that the loop-based method probably does a better job of getting rid of per-benchmark overhead like setup, which dummy_bench wouldn't do unless you wrote a specific dummy for each test.