
Quantify differences between compared runs

mnicely opened this issue 5 years ago • 1 comment

Is it possible to quantify improvements between compared runs? I know that with --benchmark-compare-fail=<stat>:<num>% I can see which tests performed worse, but I would also like to know how much better my code performs after changes. I tried passing a negative value for %, but received an error.

P.S. Fantastic tool BTW!

mnicely avatar May 10 '20 13:05 mnicely

Ooof ... well ... --benchmark-compare-fail is not good for that. Comparing benchmarks is a complicated problem and this plugin doesn't have any opinion on how they should be compared - it just gives you stats and tries to give you a clue on how things are different on each stat.

For some use-cases comparing min time is most relevant; for others, aggregates like the mean or median, or deviation measures like stddev, are more useful.

To put this differently ... you should record and compare multiple runs of the same commit to see which stats are most stable. Once you know which stat you can rely on, compare on that specific stat.
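To quantify the improvement yourself, one option is to save each run (e.g. with --benchmark-autosave) and diff the saved JSON on whichever stat proved stable. A minimal sketch, assuming the saved files have the `benchmarks`/`stats` layout shown below; `percent_change` is a hypothetical helper, not part of the plugin:

```python
import json

def percent_change(baseline, current, stat="min"):
    """Per-test percent change on one stat between two benchmark saves.
    Negative values mean the current run is faster on that stat."""
    base = {b["name"]: b["stats"][stat] for b in baseline["benchmarks"]}
    return {
        b["name"]: 100.0 * (b["stats"][stat] - base[b["name"]]) / base[b["name"]]
        for b in current["benchmarks"]
        if b["name"] in base
    }

# Synthetic saves standing in for two JSON files from the .benchmarks/
# directory -- load real ones with json.load(open(path)).
before = {"benchmarks": [{"name": "test_sort", "stats": {"min": 0.010, "mean": 0.012}}]}
after  = {"benchmarks": [{"name": "test_sort", "stats": {"min": 0.008, "mean": 0.009}}]}

# Rounded to dodge float noise; ~-20% here, i.e. min time dropped by a fifth.
print({k: round(v, 1) for k, v in percent_change(before, after).items()})
```

The same idea works for any stat the plugin records (mean, median, stddev, ...) by passing a different `stat` argument.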

Another thing ... hardware is problematic ... you should compare on the same hardware, and ideally with CPU features that have uncertain performance turned off (eg: hyper-threading off, turbo boost off or the CPU downclocked to the same multiplier for all core loads so it never hits power limits). Another useful thing is CPU pinning.
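On Linux, CPU pinning can be done from the shell (e.g. `taskset -c 0 pytest ...`) or in-process. A Linux-only sketch using the standard-library `os.sched_setaffinity`; `pin_to_core` is a hypothetical helper you might call from `conftest.py` before the benchmarks run:

```python
import os

def pin_to_core(core: int) -> None:
    """Pin the calling process to a single core (Linux-only API)."""
    os.sched_setaffinity(0, {core})  # pid 0 = the calling process

if hasattr(os, "sched_setaffinity"):  # absent on macOS/Windows
    saved = os.sched_getaffinity(0)
    pin_to_core(min(saved))          # pick one of the allowed cores
    print("running on core(s):", os.sched_getaffinity(0))
    os.sched_setaffinity(0, saved)   # restore afterwards
```

Pinning keeps the scheduler from migrating the benchmark between cores mid-run, which removes one source of timing jitter.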

ionelmc avatar May 11 '20 14:05 ionelmc