[FR] `PredictNumItersNeeded()` 1.4 correction factor
Problem description
In the function `PredictNumItersNeeded()` there is a hard-coded correction factor of 1.4. It causes the benchmark to run roughly 40% longer than the time specified by `--benchmark_min_time`.
Of course, `--benchmark_min_time` denotes the minimum amount of time to run the benchmark, but an overrun of 40% seems excessive.
This is particularly relevant on supercomputers, where CPU time is expensive.
- Why is this estimation done, instead of stopping the iterations when the accumulated "iteration time" exceeds the target time?
- Is there a reason for selecting 1.4 as a correction factor?
- In cases where the execution times are not stable, could this prediction be wrong by a large margin?
Suggested solution
I suggest either removing the correction factor or making it configurable (with a default value of 1.0).
Example
As shown in the following output (executed with `--benchmark_min_time=1s`), the real execution time, computed as time per iteration (ns) times iterations, is ~1.4 s: $7039 \times 198481 = 1397107759$ ns, $8775 \times 160984 = 1412634600$ ns, ...
------------------------------------------------------------------------------------------------------------------------
Benchmark Time CPU Iterations UserCounters...
------------------------------------------------------------------------------------------------------------------------
GNU-TBB/std::adjacent_difference/double/1024/manual_time 7039 ns 7022 ns 198481 bytes_per_second=6.50968Gi/s
GNU-TBB/std::adjacent_find/double/1024/manual_time 8775 ns 8702 ns 160984 bytes_per_second=2.61075Gi/s
GNU-TBB/std::all_of/double/1024/manual_time 7704 ns 7529 ns 199585 bytes_per_second=2.97387Gi/s
GNU-TBB/std::any_of/double/1024/manual_time 4707 ns 4625 ns 301878 bytes_per_second=4.86754Gi/s
When executing the same code with `--benchmark_min_time=0.71s` ($1/1.4 \simeq 0.71$), the execution times are much closer to 1 s: $7681 \times 134530 = 1033324930$ ns, $9265 \times 103410 = 958093650$ ns, ...
------------------------------------------------------------------------------------------------------------------------
Benchmark Time CPU Iterations UserCounters...
------------------------------------------------------------------------------------------------------------------------
GNU-TBB/std::adjacent_difference/double/1024/manual_time 7681 ns 7615 ns 134530 bytes_per_second=5.96565Gi/s
GNU-TBB/std::adjacent_find/double/1024/manual_time 9265 ns 9175 ns 103410 bytes_per_second=2.47288Gi/s
GNU-TBB/std::all_of/double/1024/manual_time 7996 ns 7572 ns 110333 bytes_per_second=2.86519Gi/s
GNU-TBB/std::any_of/double/1024/manual_time 4656 ns 4680 ns 192865 bytes_per_second=4.92025Gi/s