[ty] Speedup ty-walltime benchmarks

Open MichaReiser opened this issue 3 weeks ago • 3 comments

Summary

Reduce the time-to-completion for the walltime benchmarks from ~15 min to ~9 min (it should get even faster once the Rust caching kicks in) by:

  • Increasing the sharding from 2 to 4 jobs
  • Manually assigning the benchmarks to shards based on their walltime
  • Using a Depot runner to build the benchmarks, which reduces our CodSpeed cost and gives faster queue and build times (from ~6 min to 3 min when using a 4-core machine). We could use a larger runner, but I don't think this is necessary once the third-party dependencies are cached.

I also had to rename the benchmarks because CodSpeed seems to struggle when benchmarks from different groups run on different shards. I think that's for the better anyway:

Remove the small, medium and large groups. Projects that used to be very fast to type check now take longer and would otherwise have to be moved into another group before we could update their iteration counts.

This PR also updates the iteration counts for colour_science and pandas (from 3 to 2, equivalent to moving them to large) and pydantic (from 1 to 3, moving it from large to medium).

The downside is that we lose our historical data, but this is a better long-term setup.

Benchmark        Time/iter   Iters   Total      Shard
colour_science   1.46 min    2       ~2.9 min   1
pandas           1.04 min    2       ~2.1 min   2
tanjun           2.5s        6       ~15s       2
altair           5.1s        6       ~31s       2
static_frame     20s         3       ~1 min     3
sympy            51s         2       ~1.7 min   3
pydantic         10.6s       6       ~64s       4
multithreaded    1.4s        24      ~34s       4
freqtrade        8s          6       ~48s       4

Shard   Benchmarks                         Total Time
1       colour_science                     ~2.9 min
2       pandas|tanjun|altair               ~2.9 min
3       static_frame|sympy                 ~2.7 min
4       pydantic|multithreaded|freqtrade   ~2.4 min
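
The per-shard totals follow directly from the per-benchmark numbers (time per iteration × iterations, summed per shard). As a quick sanity check, here is a minimal Python sketch that recomputes them from the table above. The `BENCHMARKS` list and the `shard_totals_minutes` helper are illustrative names and not part of this PR, and the timings are the rounded values from the table, so the results are approximate.

```python
from collections import defaultdict

# (benchmark, seconds per iteration, iterations, shard), copied from the table above.
# Minute values are converted to seconds; all timings are the rounded table values.
BENCHMARKS = [
    ("colour_science", 1.46 * 60, 2, 1),
    ("pandas", 1.04 * 60, 2, 2),
    ("tanjun", 2.5, 6, 2),
    ("altair", 5.1, 6, 2),
    ("static_frame", 20.0, 3, 3),
    ("sympy", 51.0, 2, 3),
    ("pydantic", 10.6, 6, 4),
    ("multithreaded", 1.4, 24, 4),
    ("freqtrade", 8.0, 6, 4),
]


def shard_totals_minutes(benchmarks):
    """Sum (time per iteration * iterations) for each shard, in minutes."""
    totals = defaultdict(float)
    for _name, secs_per_iter, iters, shard in benchmarks:
        totals[shard] += secs_per_iter * iters
    return {shard: seconds / 60 for shard, seconds in sorted(totals.items())}


totals = shard_totals_minutes(BENCHMARKS)
for shard, minutes in totals.items():
    print(f"shard {shard}: ~{minutes:.1f} min")

# The slowest shard bounds the wall-clock time of the benchmark run step.
slowest = max(totals, key=totals.get)
print(f"slowest shard: {slowest}")
```

Shard 1, which runs only colour_science, comes out as the slowest at about 2.9 min, so it bounds the wall-clock time of the benchmark step. The other shards land within about 0.1 min of the totals listed above.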

TLDR: The walltime benchmarks now often complete before the ty-instrumented benchmarks.

MichaReiser • Dec 21 '25 10:12