TST: Add pytest-codspeed and benchmarking suite
Would be great for testing performance optimizations.
Do you mean adding tests with pytest-benchmark? I'm tinkering with something similar over in optx - specifically I'd like to support benchmarking our solvers on optimisation problems. We could probably have a unified approach to benchmarking compile times too.
Personally I landed on this due to the familiar format, and the option for performance comparison across versions.
Apologies if this is too far off-topic :)
Re pytest-benchmark, also good!
Though pytest-codspeed is essentially a drop-in replacement for pytest-benchmark and hooks up to the nice https://codspeed.io service (free for open source).
I like this idea as well. I've thought about suggesting speed regression tests for diffrax before (since that has come up both in my own work and for others in the issues), but unifying things to check speed in general sounds like a good idea.
This sounds reasonable to me. IIUC codspeed is a service for recording benchmark values, and pytest-codspeed is otherwise exactly the same as pytest-benchmark? If so, this all sounds reasonable to me.
In general, pytest-codspeed is a drop-in replacement.
It does lack a few features supported by pytest-benchmark, but I haven't really run into them. Instead it offers more stable measurements and nice visualization / GitHub hooks / etc. They are also adding other cool features I haven't yet taken advantage of.
The most challenging thing for benchmarking JAX is separately testing the jit-compile vs jit-eval. See https://github.com/GalacticDynamics/unxt/blob/v1.4.0/tests/benchmark/test_quaxed.py for one possible approach.
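For concreteness, here's a rough sketch of the kind of split I mean (not the unxt code verbatim; the toy function and test names are made up, and repeated compilation may need extra care around caching, e.g. `jax.clear_caches()`):

```python
import jax
import jax.numpy as jnp


def f(x):
    # Toy stand-in for the function under test.
    return jnp.sum(jnp.sin(x) ** 2)


def test_jit_compile(benchmark):
    x = jnp.arange(1000.0)

    def compile_only():
        # AOT lower + compile; this measures compilation, not execution.
        return jax.jit(f).lower(x).compile()

    benchmark(compile_only)


def test_jit_eval(benchmark):
    x = jnp.arange(1000.0)
    compiled = jax.jit(f).lower(x).compile()  # compile once, outside the timed region

    def run():
        # block_until_ready so we time the computation, not just the dispatch.
        return jax.block_until_ready(compiled(x))

    benchmark(run)
```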
Yup, using AOT compilation as a measure of compilation time makes sense to me, though it is not necessarily the same as JIT compilation time (?).
With codspeed, can custom metrics be added to the reported/saved results? To benchmark solvers, this would be useful - and could include things such as the number of steps taken, as well as how far off we are from the expected result.
Is this maybe what you are thinking of? There's this pattern:
```python
from statistics import mean, median


def test_mean_and_median_performance(benchmark):
    # Precompute some data useful for the benchmark but that should not be
    # included in the benchmark time
    data = [1, 2, 3, 4, 5]

    # Benchmark the execution of the function:
    # The `@benchmark` decorator will automatically call the function and
    # measure its execution
    @benchmark
    def bench():
        mean(data)
        median(data)
```
So after benchmarking you can test correctness. However I'm not sure that's better than separation of concerns: different tests for different things.
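That said, if you do want the result back, I believe the fixture returns whatever the benchmarked callable returns (pytest-benchmark certainly does), so a correctness check can follow the timed call. A rough sketch with a dummy stand-in for a solver:

```python
from types import SimpleNamespace


def solve():
    # Hypothetical stand-in for e.g. a diffrax/optimistix solve returning a Solution.
    return SimpleNamespace(num_steps=42, max_error=1e-8)


def test_solver_speed_and_sanity(benchmark):
    sol = benchmark(solve)      # timed call; the result is passed back through
    assert sol.num_steps < 100  # correctness/sanity check, not included in the timing
```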
I mean this feature: https://pytest-benchmark.readthedocs.io/en/latest/usage.html#extra-info
So the function being benchmarked can return something (such as a diffrax or optimistix Solution), fields of which we might like to save.
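Something like this, with a dummy solver standing in for the real thing (the `extra_info` dict is the pytest-benchmark feature linked above):

```python
from types import SimpleNamespace


def solve():
    # Hypothetical stand-in for a diffrax/optimistix solve returning a Solution.
    return SimpleNamespace(num_steps=42, max_error=1e-8)


def test_solver_with_metrics(benchmark):
    sol = benchmark(solve)
    # Record custom metrics alongside the timing in the saved benchmark results.
    benchmark.extra_info["num_steps"] = sol.num_steps
    benchmark.extra_info["max_error"] = sol.max_error
```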
Oh, that I'm not sure about in pytest-codspeed.
They can probably coexist happily; I doubt we'd want to run a large benchmarking suite in CI anyway! But some speed tests on a small set of representative problems would be great to include.