BenchmarkTools.jl

Feature Request: `@benchmark f() g()`

Open LilithHafner opened this issue 4 years ago • 7 comments

viraltux commented:

> Most of the time, the reason we use BenchmarkTools is not that we want to know how fast A is, but rather whether A is faster than B and by how much. A very good addition to BenchmarkTools, in my opinion, would be a macro to compare A vs B vs … X, instead of us guessing whether one is faster than the others based on their statistics. This macro would also allow for internal bias reduction (reloading A and B and …, etc.), and running such a macro for a long time should also account for potential bias from the whole machine/OS.

And I concur. When benchmarking and optimizing a function, I often define function_old() and function_new() and check whether changes to function_new() have the runtime impact I expect. Ideally, a benchmarking package lets me perform that comparison correctly, easily, quickly, and precisely. A well-crafted varargs @benchmark that supports @benchmark function_old() function_new() would be ideal.

This extension could also help users like me avoid common benchmark-comparison pitfalls, such as those discussed in the linked Discourse thread.

LilithHafner avatar Jul 27 '21 20:07 LilithHafner

Take a look at https://juliaci.github.io/BenchmarkTools.jl/dev/manual/#Handling-benchmark-results, especially judge.

vchuravy avatar Jul 27 '21 21:07 vchuravy

Perhaps this workflow is common enough to let @benchmark f() g() expand to judge(minimum(@benchmark f()), minimum(@benchmark g()))?
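For reference, here is a minimal sketch of what that proposed expansion would compute when written out by hand today. The functions `f` and `g` and the array `xs` are hypothetical stand-ins, and the `samples`/`evals` settings are only there to keep the run short.

```julia
using BenchmarkTools

# Hypothetical stand-ins for the two implementations being compared.
const xs = rand(1_000)
f() = sum(abs2, xs)
g() = sum(x -> x^2, xs)

# Benchmark each function separately.
trial_f = @benchmark f() samples=100 evals=10
trial_g = @benchmark g() samples=100 evals=10

# judge(target, baseline) compares two estimates and classifies the target
# as an :improvement, a :regression, or :invariant relative to the baseline.
j = judge(minimum(trial_f), minimum(trial_g))

println(j)
println("time ratio f/g: ", time(ratio(j)))
```

Note that `judge` takes the target first and the baseline second, so the ratio printed here is the time of `f` divided by the time of `g`.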

Additionally, is there some way to take advantage of knowing the primary goal of a benchmark is to compare 2 functions by, for example, randomly alternating samples or blocks of samples?

LilithHafner avatar Jul 27 '21 22:07 LilithHafner

I'd rather not overcomplicate the @benchmark interface.

> Additionally, is there some way to take advantage of knowing the primary goal of a benchmark is to compare 2 functions by, for example, randomly alternating samples or blocks of samples?

Hm, not currently. I don't know whether that would help or hurt; the branch predictor would learn that pattern.
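To make the idea under discussion concrete, here is a rough sketch of randomly alternating samples, written in plain Julia rather than with BenchmarkTools. `interleaved_times` and `sample_time` are hypothetical helpers, not part of any package, and whether this ordering actually reduces bias is exactly the open question above. The sketch also lacks BenchmarkTools' tuning and its safeguards against the compiler eliding the measured work.

```julia
using Random

# Hypothetical stand-ins for the two implementations being compared.
const ys = rand(1_000)
f() = sum(abs2, ys)
g() = sum(x -> x^2, ys)

# Time `evals` back-to-back calls of `fn`; return seconds per call.
function sample_time(fn, evals)
    t = @elapsed for _ in 1:evals
        fn()
    end
    return t / evals
end

# Collect `samples` timings for each function, choosing at random
# which of the two runs first within each pair of samples.
function interleaved_times(f, g; samples=200, evals=100, rng=Random.default_rng())
    tf = Float64[]
    tg = Float64[]
    f(); g()  # warm up so compilation is not measured
    for _ in 1:samples
        if rand(rng, Bool)
            push!(tf, sample_time(f, evals))
            push!(tg, sample_time(g, evals))
        else
            push!(tg, sample_time(g, evals))
            push!(tf, sample_time(f, evals))
        end
    end
    return tf, tg
end

tf, tg = interleaved_times(f, g)
println("min time ratio f/g: ", minimum(tf) / minimum(tg))
```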

vchuravy avatar Jul 27 '21 22:07 vchuravy

Perhaps a documentation solution then? When I opened this issue I had already loosely read/closely skimmed the (mercifully short!) manual cover to cover, but found the benchmarkgroups and judge sections a bit intimidating and didn't put together judge(minimum(@benchmark f()), minimum(@benchmark g())) as a supported solution to my problem. Nevertheless, I got along fine with things like (@elapsed f())/(@elapsed g()) until I ran into the tangentially related issue that started this thread.
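As an illustration of the gap between those two approaches, the sketch below puts a single-shot `@elapsed` ratio next to a ratio of `@belapsed` results, which reports the minimum time over many samples. The functions `f` and `g` and the array `zs` are hypothetical, and the `samples`/`evals` values are arbitrary choices to keep the run short.

```julia
using BenchmarkTools

# Hypothetical stand-ins for the two implementations being compared.
const zs = rand(1_000)
f() = sum(abs2, zs)
g() = sum(x -> x^2, zs)

f(); g()  # warm up so the @elapsed calls below do not include compilation

# One measurement of each function: quick, but noisy.
naive_ratio = (@elapsed f()) / (@elapsed g())

# Minimum over many samples of each function, in seconds.
robust_ratio = (@belapsed f() samples=100 evals=10) / (@belapsed g() samples=100 evals=10)

println("single-shot ratio: ", naive_ratio)
println("@belapsed ratio:   ", robust_ratio)
```

Running this a few times should show the single-shot ratio moving around noticeably more than the `@belapsed` ratio, although the exact numbers depend on the machine.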

LilithHafner avatar Jul 27 '21 22:07 LilithHafner

Improving the docs would be fantastic! If you have the time maybe you can take a stab at it?

vchuravy avatar Jul 28 '21 06:07 vchuravy

Maybe there's some inspiration we can take from cargo bench and how it keeps a record of previous benchmarks? I don't know how they take care of modalities when plotting, but as far as I know there are some web pages generated for displaying plots of different runs next to each other.
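BenchmarkTools can already persist results to JSON, so a hand-rolled version of "compare against the previous run" might look like the sketch below. This is not how cargo bench works and does no plotting. The file location and the function `f` are hypothetical, and here the baseline and the new run measure the same code purely for demonstration.

```julia
using BenchmarkTools

# Hypothetical function under test.
const ws = rand(1_000)
f() = sum(abs2, ws)

# Hypothetical location for the stored baseline; the file name must end in .json.
path = joinpath(mktempdir(), "baseline.json")

# First run: record a baseline.
baseline = @benchmark f() samples=100 evals=10
BenchmarkTools.save(path, baseline)

# Later run: load the stored trial and compare a fresh measurement against it.
previous = BenchmarkTools.load(path)[1]  # load returns a vector of saved objects
current = @benchmark f() samples=100 evals=10

println(judge(minimum(current), minimum(previous)))
```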

Seelengrab avatar Jul 28 '21 10:07 Seelengrab