Add benchmarks, part 1
Given the increasing importance of NNlib in the ML ecosystem, I believe it's time to add automatic benchmarks. This PR is based on the amazing PkgBenchmark.jl and the config from ProximalOperators.jl. It is only a partial improvement: the benchmark code compares a change against a baseline on master, but the current master doesn't have any benchmarks at all, so the benchmark job will fail by design for this PR. All following PRs (with or without changes to the benchmarks) should work fine.
To be precise, here's what this PR does:
- adds a set of simple benchmarks which can be triggered from the command line using `julia --project=benchmark benchmark/runbenchmarks.jl` (a minimal sketch of the setup is shown below)
- adds a GitHub Actions job to run these benchmarks automatically on every PR
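For reference, here is a rough sketch of how such a PkgBenchmark.jl setup usually looks; the benchmark names, input sizes, and the exact `judge`/`export_markdown` calls below are illustrative assumptions, not necessarily what this PR contains.

```julia
# benchmark/benchmarks.jl -- PkgBenchmark picks up the `SUITE` defined here
using BenchmarkTools
using NNlib

const SUITE = BenchmarkGroup()

# a couple of illustrative microbenchmarks; names and sizes are arbitrary
x = rand(Float32, 128, 128)

SUITE["activations"] = BenchmarkGroup()
SUITE["activations"]["relu"] = @benchmarkable relu.($x)

SUITE["softmax"] = BenchmarkGroup()
SUITE["softmax"]["softmax"] = @benchmarkable softmax($x)
```

```julia
# benchmark/runbenchmarks.jl -- run the suite for the working tree and for the
# master baseline, then print a human-readable comparison
using PkgBenchmark

results = PkgBenchmark.judge("NNlib", "master")
PkgBenchmark.export_markdown(stdout, results)
```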
What this PR does not do:
- doesn't provide finished and verified work - this kind of thing can only be tested by merging into master and watching the CI output; still, I'll try to finalize it reasonably quickly
- doesn't add a comprehensive benchmark set - I think we need to add them step by step
- doesn't post the benchmark judgement to the PR - I'm not sure about the consistency of benchmarks on a shared GitHub environment, but I definitely have it in mind
Thanks! Benchmarking is very important. I think we would want to consolidate these in FluxBench.jl (in this org), so that we can reliably track and reproduce our benchmarks. So maybe moving this over there would be better too.
Thanks, I didn't know about FluxBench! What's the intended usage for it? I thought about running the benchmarks on every PR automatically to notify the author about possible performance regressions before it gets merged. Do you have in mind something similar for FluxBench?
Can this be triggered only on request, instead of on every commit of every PR? I'm thinking of something similar to @nanosoldier for Base.
That's the job of the package, to have things in one place and be called on by the keeper bot
@DhairyaLGandhi can you point @dfdx in the right direction for contributing? I didn't know about FluxBench or the keeper bot either; there are no comments about them and no documentation anywhere. Is the keeper bot working already?
@tkf provides a useful tool, BenchmarkCI, for this very specific purpose. It also supports running the benchmark only when a specific label is present, e.g., it is a no-op if the run benchmark label is not added. An example of this can be found in https://github.com/JuliaImages/ImageContrastAdjustment.jl/pull/37.
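For context, the Julia side of BenchmarkCI is just a tiny driver script that the workflow calls; a minimal sketch (assuming the default `benchmark/` layout and a token available for posting the comment) looks roughly like:

```julia
# Driver script run by the GitHub Actions workflow; it reuses the
# PkgBenchmark SUITE defined in benchmark/benchmarks.jl.
import BenchmarkCI

BenchmarkCI.judge()      # benchmark the PR head against the default branch
BenchmarkCI.postjudge()  # post the judgement as a PR comment
```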
Flux Bot*
BenchmarkCI is definitely a good shout. It is also important not to have to deal with shared workers or different system setups adding noise, but rather to work on dedicated benchmarking machines.
BenchmarkCI looks great! A couple of things I didn't understand from the discussion (perhaps, missing some context):
- what is the keeper bot / Flux Bot? I can't find any reference to them
- should we add benchmarks and CI for them in this repo, or put everything in FluxBench?
- what is the optimal trigger for the benchmarks?
I guess some of the previous comments answer exactly these questions, but I can't connect the dots.
I think it is fine to add microbenchmarks here until we sort out a more general benchmarking infrastructure.
I can't figure out what the best trigger would be
https://docs.github.com/en/actions/reference/events-that-trigger-workflows
I'd like something similar to what we have in Base Julia, i.e. triggered by a comment like
@nanosoldier runbenchmarks(ALL, vs=":master")
if that can be achieved without an external bot
This can help https://github.com/marketplace/actions/pull-request-comment-trigger