
Access to performance within a benchmark

Open oliverbunting opened this issue 5 years ago • 14 comments

Description I'd really like to be able to easily check benchmark times as part of CI, to catch unwanted performance regressions.

Within the context of BENCHMARK, I'd like two features:

  1. The ability to assert that the meter time is below some value for each iteration. This would enable me to ensure that a latency bound is never exceeded.

  2. The ability to assert that the mean is below some value.

If these can be achieved with the existing machinery, then I'd like that clearly added to the docs, because it is not obvious how to achieve these aims.
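
Neither assertion is supported out of the box today. A crude interim workaround, without any of the statistical machinery BENCHMARK provides, is to time a fixed number of iterations with std::chrono inside an ordinary TEST_CASE and REQUIRE on the per-call average. A minimal sketch follows; work_under_test, the iteration count, and the 2 µs budget are placeholders, and the include path assumes a packaged Catch2 v2.x:

#include <catch2/catch.hpp> // Catch2 v2.x; use "catch.hpp" for the single-header drop-in

#include <chrono>

// Placeholder workload; replace with the code whose latency you care about.
static void work_under_test() {
    volatile int x = 0;
    for (int i = 0; i < 100; ++i) x += i;
}

TEST_CASE("latency budget (crude wall-clock check)") {
    using clock = std::chrono::steady_clock;

    constexpr int iterations = 1000;                      // placeholder
    constexpr auto budget = std::chrono::microseconds(2); // placeholder per-call budget

    const auto start = clock::now();
    for (int i = 0; i < iterations; ++i) {
        work_under_test();
    }
    const auto elapsed = clock::now() - start;

    // The average time per call must stay under the budget.
    REQUIRE(elapsed / iterations < budget);
}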

oliverbunting avatar Jan 24 '20 13:01 oliverbunting

There's quite a bit of info in a benchmark's output. Do you absolutely need to be able to assert on iteration time and mean time inside the test case? If that's not needed, parsing the output XML and doing some statistical analysis as part of your CI pipeline could achieve what you want.
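
For reference, the XML reporter does carry the benchmark statistics. With a recent 2.x/3.x build, a run such as

./tests --reporter xml --out results.xml

emits one BenchmarkResults element per benchmark, roughly of the following shape (the values shown are placeholders; times are reported in nanoseconds, but verify the element and attribute names against your own output):

<BenchmarkResults name="my benchmark" samples="100" resamples="100000" iterations="1">
  <mean value="1234.5" lowerBound="1200.1" upperBound="1270.9" ci="0.95"/>
  <standardDeviation value="..." lowerBound="..." upperBound="..." ci="0.95"/>
  <outliers .../>
</BenchmarkResults>

A CI step can pull the mean value out of that file and fail the build when it exceeds a stored baseline.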

Tridacnid avatar May 25 '20 01:05 Tridacnid

I need this feature as well, as I want to calculate throughput as part of the benchmark. It's currently impossible to access the BenchmarkStats without a custom Reporter. I see no reason why it shouldn't be accessible through Catch::getResultCapture(), as is possible e.g. with getResultCapture()->getLastResult() for test results.

I don't have a CI setup and I don't plan on creating one just to calculate throughput. Accessing measurement results should be part of the public API of any benchmarking tool.
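
The custom reporter doesn't have to be a full reporter, though: an event listener is enough to capture BenchmarkStats today. A minimal sketch for Catch2 v2.x; benchmarkMeansNs() and the listener name are ad-hoc helpers rather than part of Catch2, and benchmarking support has to be enabled before including the header:

// In v2 this needs to live in the main TU (the one defining CATCH_CONFIG_MAIN or
// CATCH_CONFIG_RUNNER), or in a TU compiled with CATCH_CONFIG_EXTERNAL_INTERFACES.
#define CATCH_CONFIG_ENABLE_BENCHMARKING
#include <catch2/catch.hpp>

#include <map>
#include <string>

// Ad-hoc storage for the mean of every benchmark, keyed by benchmark name.
static std::map<std::string, double>& benchmarkMeansNs() {
    static std::map<std::string, double> means;
    return means;
}

struct BenchmarkListener : Catch::TestEventListenerBase {
    using TestEventListenerBase::TestEventListenerBase;

    void benchmarkEnded(Catch::BenchmarkStats<> const& stats) override {
        // stats.mean.point is a floating-point std::chrono duration, nanoseconds by default.
        benchmarkMeansNs()[stats.info.name] = stats.mean.point.count();
    }
};
CATCH_REGISTER_LISTENER(BenchmarkListener)

As far as I can tell, the measurement runs inside the BENCHMARK statement itself, so the listener has already been called by the time the next line of the test executes, and a plain REQUIRE on benchmarkMeansNs().at("name") works right after the benchmark.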

For throughput calculations in particular there is also https://github.com/catchorg/Catch2/issues/1839. While certainly a common task, I'm not sure whether this should become a feature. A more generic approach that can add arbitrary user-defined metrics would be better IMO (yet arguably more expensive to implement).


On an unrelated side note: is there any justification for BENCHMARK_ADVANCED? Setup code can already be conveniently provided through the scope of the surrounding TEST_CASE, SECTION, or code block.


Also, I believe the minimum sample time should be reported. When doing micro-benchmarks, jitter is always strictly positive and does not have a zero mean. There are things that slow down your code (e.g. reduced CPU frequency, scheduling, heavy workload, etc.), but it can never exceed its optimal execution speed (zero jitter). The minimum of the measurements approximates this optimum.
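
The raw samples are on the same stats object, so the listener sketched above could record the minimum as well. A minimal addition, assuming benchmarkMinsNs() is another ad-hoc map like benchmarkMeansNs() and <algorithm> is included for std::min_element:

// Inside BenchmarkListener::benchmarkEnded, next to the mean:
// stats.samples is a std::vector of floating-point std::chrono durations.
benchmarkMinsNs()[stats.info.name] =
    std::min_element(stats.samples.begin(), stats.samples.end())->count();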

sehoffmann avatar May 27 '20 13:05 sehoffmann

Conceptually, the calculated statistics have to be saved at this place: https://github.com/catchorg/Catch2/blob/b1b5cb812277f367387844aab46eb2d3b15d03cd/include/internal/benchmark/catch_benchmark.hpp#L82 Either implicitly in the following call to benchmarkEnded(), or explicitly.

Alternatively, one can also consider "returning" the results from the BENCHMARK macro, which conceptually would look like:

auto results = BENCHMARK("My Benchmark") {
    // ...
};

However, I believe this to be impossible with the current syntax and implementation via the assignment operator.
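
For context on why the assignment form cannot work: as of v2.13 / v3, BENCHMARK expands to roughly the following (a simplified sketch, not the literal macro text from catch_benchmark.hpp), so the whole construct is an if statement followed by an assignment to a hidden local object, and there is no value that an outer auto results = could bind to:

if (Catch::Benchmark::Benchmark uniqueName{"My Benchmark"})
    uniqueName = [&] {
        // user code; Benchmark::operator= stores the lambda, runs the
        // measurement, and hands the stats to the reporter via benchmarkEnded()
    };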

sehoffmann avatar May 27 '20 13:05 sehoffmann

I think I need this feature too. In A/B benchmark testing, I want the test to require that B is faster than A. If we chose to use B instead of A, based on the results of the benchmark testing, then the test should fail if B is no longer faster than A. I'd be satisfied if I could access the mean value for each case in a REQUIRE call.

We are using Catch v2.13.8.
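
Building on the event-listener sketch earlier in this thread (benchmarkMeansNs() is the ad-hoc helper from that sketch; algorithmA and algorithmB are placeholders for the two implementations being compared), an A/B check in v2.13.x could look roughly like this:

TEST_CASE("B must stay faster than A") {
    BENCHMARK("A") { return algorithmA(); };
    BENCHMARK("B") { return algorithmB(); };

    // The measurements run inside the BENCHMARK statements themselves, so the
    // listener has already recorded both means by this point.
    REQUIRE(benchmarkMeansNs().at("B") < benchmarkMeansNs().at("A"));
}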

ScottHutchinson avatar Feb 22 '22 16:02 ScottHutchinson

Is there any update on this feature request? Has anyone figured out how to do it?

Epixu avatar Jul 17 '23 12:07 Epixu

We stopped using Catch benchmark tests, because we could find no way to output meaningful results in CI builds. Sadly we are writing our own custom benchmark code instead.

ScottHutchinson avatar Jul 17 '23 14:07 ScottHutchinson

This would indeed be a nice feature.

DStrelak avatar Aug 29 '23 07:08 DStrelak

I might do some experiments soon; I'll keep you posted. My plan is to expose and cache benchmark results in an easily parseable file every time they are executed, indexed by the test name. One could then check REQUIRE_DEVIATION(<= 0.1f) inside the benchmark code, which would test against all previous runs and fail the test in case of regression. Something along these lines:


auto firstBench = BENCHMARK_ADVANCED("benchmark one")(Catch::Benchmark::Chronometer meter) {
   some<uninitialized<stuff>> storage(meter.runs());
   meter.measure([&](int i) {
      return storage[i].construct();
   });
   REQUIRE_DEVIATION(<= 0.1); // pass the test only within 10% deviation from the mean of all previous runs?
};

auto secondBench = BENCHMARK_ADVANCED("benchmark two")(Catch::Benchmark::Chronometer meter) {
   some<uninitialized<stuff>> storage(meter.runs());
   meter.measure([&](int i) {
      return storage[i].construct();
   });
   REQUIRE_DEVIATION(<= 0.1); // pass the test only within 10% deviation from the mean of all previous runs?
};

REQUIRE(firstBench > secondBench); // require the first benchmark to be slower than the second
REQUIRE(firstBench.Deviation(secondBench) <= 0.1); // require deviations to be within 10%
         

Any suggestions are welcome. Also, don't hold your breath, because it is not my top priority at the moment.

Epixu avatar Aug 30 '23 15:08 Epixu

I will add my vote to this.

I am considering Catch2 for future use in my YOMM2 library. Since I try hard to make open methods run almost as fast as native virtual functions, being able to write performance tests would be very useful.

jll63 avatar Apr 12 '24 20:04 jll63