[ty] Speed up ty-walltime benchmarks

Open MichaReiser opened this issue 3 weeks ago • 3 comments

Summary

Reduce the time-to-completion for the walltime benchmarks from ~15 min to ~9 min (it should be even faster once the Rust caching kicks in) by:

Increase the sharding from 2 to 4 jobs
Manually select the benchmarks for each shard based on their walltime
Use a Depot runner to build the benchmarks, which reduces our CodSpeed cost and gives faster queue and build times (from ~6 min to 3 min when using a 4-core machine). We could use a larger runner, but I don't think that is necessary once the third-party dependencies are cached.

I also had to rename the benchmarks because CodSpeed seems to struggle if benchmarks from different groups run on different shards. But I think that's for the better anyway:

Remove the small, medium and large groups because projects that used to be very fast to type check now take longer and would have to be moved into another group so that we can update the iteration counts.

This PR also updates the iteration counts for colour_science and pandas (from 3 to 2, equal to moving them to large) and pydantic (from 1 to 3, moving it from large to medium).

The downside is that we lose our historical data, but this is a better long-term setup.

Benchmark	Time/iter	Iters	Total	Shard
colour_science	1.46 min	2	~2.9 min	1
pandas	1.04 min	2	~2.1 min	2
tanjun	2.5s	6	~15s	2
altair	5.1s	6	~31s	2
static_frame	20s	3	~1 min	3
sympy	51s	2	~1.7 min	3
pydantic	10.6s	6	~64s	4
multithreaded	1.4s	24	~34s	4
freqtrade	8s	6	~48s	4
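
The Total column is the time per iteration multiplied by the iteration count, and each shard's runtime is the sum of its benchmarks' totals. The sketch below transcribes the table values and recomputes both; the dictionary and function names are illustrative and not part of this PR. Because it sums the unrounded products, a shard total can differ slightly from adding up the rounded Total values.

```python
# Seconds per iteration, iteration count, and shard for each benchmark,
# transcribed from the table above. Names here are illustrative only.
BENCHMARKS = {
    "colour_science": (1.46 * 60, 2, 1),
    "pandas": (1.04 * 60, 2, 2),
    "tanjun": (2.5, 6, 2),
    "altair": (5.1, 6, 2),
    "static_frame": (20.0, 3, 3),
    "sympy": (51.0, 2, 3),
    "pydantic": (10.6, 6, 4),
    "multithreaded": (1.4, 24, 4),
    "freqtrade": (8.0, 6, 4),
}


def total_seconds(name):
    """Total walltime of one benchmark: time per iteration * iterations."""
    secs_per_iter, iters, _shard = BENCHMARKS[name]
    return secs_per_iter * iters


def shard_totals():
    """Sum the benchmark totals for each shard, in seconds."""
    totals = {}
    for name, (_secs, _iters, shard) in BENCHMARKS.items():
        totals[shard] = totals.get(shard, 0.0) + total_seconds(name)
    return totals


if __name__ == "__main__":
    for shard, secs in sorted(shard_totals().items()):
        print(f"shard {shard}: ~{secs / 60:.1f} min")
```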

Shard	Benchmarks	Total Time
1	`colour_science`	~2.9 min
2	`pandas|tanjun|altair`	~2.9 min
3	`static_frame|sympy`	~2.7 min
4	`pydantic|multithreaded|freqtrade`	~2.4 min
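
The shard assignment above was picked by hand. As a rough cross-check, a greedy "longest benchmark first" heuristic can be run over the same totals. This is only an illustrative sketch, not what the PR does, and `TOTALS` and `greedy_shards` are hypothetical names. It assigns each benchmark, longest first, to whichever shard currently has the smallest load.

```python
# Total walltime per benchmark in seconds (time per iteration * iterations),
# transcribed from the benchmark table in this description.
TOTALS = {
    "colour_science": 1.46 * 60 * 2,
    "pandas": 1.04 * 60 * 2,
    "tanjun": 2.5 * 6,
    "altair": 5.1 * 6,
    "static_frame": 20.0 * 3,
    "sympy": 51.0 * 2,
    "pydantic": 10.6 * 6,
    "multithreaded": 1.4 * 24,
    "freqtrade": 8.0 * 6,
}


def greedy_shards(totals, shard_count):
    """Assign the longest remaining benchmark to the least-loaded shard."""
    shards = [[] for _ in range(shard_count)]
    loads = [0.0] * shard_count
    for name, secs in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
        idx = loads.index(min(loads))  # first shard with the smallest load
        shards[idx].append(name)
        loads[idx] += secs
    return shards, loads


shards, loads = greedy_shards(TOTALS, 4)
for names, load in zip(shards, loads):
    # Join names with "|" in the same style as the Benchmarks column.
    print("|".join(names), f"~{load / 60:.1f} min")
```

With these numbers, `colour_science` alone takes ~2.9 min, so no four-way split can finish sooner than that. The greedy heuristic groups the smaller benchmarks differently but ends up with the same longest shard as the manual selection.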

TLDR: The walltime benchmarks now often complete before the ty-instrumented benchmarks.

Dec 21 '25 10:12 MichaReiser