ruff icon indicating copy to clipboard operation
ruff copied to clipboard

[ty] Cache `ClassType::nearest_disjoint_base`

Open zanieb opened this issue 6 days ago • 10 comments

Pulled out of https://github.com/astral-sh/ruff/pull/22052

Locally, this improved performance on Pydantic by 5%, avoiding the regressions in the referenced pull request.

zanieb avatar Dec 18 '25 22:12 zanieb

Diagnostic diff on typing conformance tests

No changes detected when running ty on typing conformance tests ✅

astral-sh-bot[bot] avatar Dec 18 '25 22:12 astral-sh-bot[bot]

mypy_primer results

Changes were detected when running on open source projects
Tanjun (https://github.com/FasterSpeeding/Tanjun)
- tanjun/dependencies/data.py:347:12: error[invalid-return-type] Return type does not match returned value: expected `_T@cached_inject`, found `_T@cached_inject | Coroutine[Any, Any, _T@cached_inject | Coroutine[Any, Any, _T@cached_inject]]`
+ tanjun/dependencies/data.py:347:12: error[invalid-return-type] Return type does not match returned value: expected `_T@cached_inject`, found `Coroutine[Any, Any, _T@cached_inject | Coroutine[Any, Any, _T@cached_inject]] | _T@cached_inject`

xarray (https://github.com/pydata/xarray)
- xarray/core/dataarray.py:5744:16: error[invalid-return-type] Return type does not match returned value: expected `T_Xarray@map_blocks`, found `T_Xarray@map_blocks | DataArray | Dataset`
+ xarray/core/dataarray.py:5744:16: error[invalid-return-type] Return type does not match returned value: expected `T_Xarray@map_blocks`, found `DataArray | Dataset`
- xarray/core/dataset.py:8873:16: error[invalid-return-type] Return type does not match returned value: expected `T_Xarray@map_blocks`, found `T_Xarray@map_blocks | DataArray | Dataset`
+ xarray/core/dataset.py:8873:16: error[invalid-return-type] Return type does not match returned value: expected `T_Xarray@map_blocks`, found `DataArray | Dataset`

jax (https://github.com/google/jax)
+ jax/_src/tree_util.py:302:31: error[invalid-argument-type] Argument to bound method `register_node` is incorrect: Expected `(Hashable, Iterable[object], /) -> T@register_pytree_node`, found `(_AuxData@register_pytree_node, _Children@register_pytree_node, /) -> T@register_pytree_node`
+ jax/_src/tree_util.py:305:31: error[invalid-argument-type] Argument to bound method `register_node` is incorrect: Expected `(Hashable, Iterable[object], /) -> T@register_pytree_node`, found `(_AuxData@register_pytree_node, _Children@register_pytree_node, /) -> T@register_pytree_node`
+ jax/_src/tree_util.py:308:31: error[invalid-argument-type] Argument to bound method `register_node` is incorrect: Expected `(Hashable, Iterable[object], /) -> T@register_pytree_node`, found `(_AuxData@register_pytree_node, _Children@register_pytree_node, /) -> T@register_pytree_node`
- Found 2802 diagnostics
+ Found 2805 diagnostics

static-frame (https://github.com/static-frame/static-frame)
- static_frame/core/bus.py:671:16: error[invalid-return-type] Return type does not match returned value: expected `InterGetItemLocReduces[Bus[Any], object_]`, found `InterGetItemLocReduces[Top[Bus[Any]] | TypeBlocks | Batch | ... omitted 6 union elements, object_]`
+ static_frame/core/bus.py:675:16: error[invalid-return-type] Return type does not match returned value: expected `InterGetItemILocReduces[Bus[Any], object_]`, found `InterGetItemILocReduces[Self@iloc | Bus[Any], object_ | Self@iloc]`
- static_frame/core/bus.py:675:16: error[invalid-return-type] Return type does not match returned value: expected `InterGetItemILocReduces[Bus[Any], object_]`, found `InterGetItemILocReduces[Top[Index[Any]] | TypeBlocks | Top[Bus[Any]] | ... omitted 6 union elements, generic[object]]`
- static_frame/core/series.py:772:16: error[invalid-return-type] Return type does not match returned value: expected `InterGetItemILocReduces[Series[Any, Any], TVDtype@Series]`, found `InterGetItemILocReduces[Top[Index[Any]] | TypeBlocks | Top[Bus[Any]] | ... omitted 6 union elements, generic[object]]`
+ static_frame/core/series.py:772:16: error[invalid-return-type] Return type does not match returned value: expected `InterGetItemILocReduces[Series[Any, Any], TVDtype@Series]`, found `InterGetItemILocReduces[Top[Index[Any]] | Top[Series[Any, Any]] | TypeBlocks | ... omitted 6 union elements, generic[object]]`
- static_frame/core/series.py:4072:16: error[invalid-return-type] Return type does not match returned value: expected `InterGetItemILocReduces[SeriesHE[Any, Any], TVDtype@SeriesHE]`, found `InterGetItemILocReduces[Top[Index[Any]] | TypeBlocks | Top[Bus[Any]] | ... omitted 7 union elements, generic[object]]`
+ static_frame/core/series.py:4072:16: error[invalid-return-type] Return type does not match returned value: expected `InterGetItemILocReduces[SeriesHE[Any, Any], TVDtype@SeriesHE]`, found `InterGetItemILocReduces[Top[Index[Any]] | Top[Series[Any, Any]] | TypeBlocks | ... omitted 6 union elements, generic[object]]`
- static_frame/core/yarn.py:418:16: error[invalid-return-type] Return type does not match returned value: expected `InterGetItemILocReduces[Yarn[Any], object_]`, found `InterGetItemILocReduces[Top[Index[Any]] | TypeBlocks | Top[Bus[Any]] | ... omitted 6 union elements, generic[object]]`
+ static_frame/core/yarn.py:418:16: error[invalid-return-type] Return type does not match returned value: expected `InterGetItemILocReduces[Yarn[Any], object_]`, found `InterGetItemILocReduces[Top[Index[Any]] | Top[Series[Any, Any]] | Top[Yarn[Any]] | ... omitted 6 union elements, generic[object]]`
- Found 1844 diagnostics
+ Found 1843 diagnostics

pandas-stubs (https://github.com/pandas-dev/pandas-stubs)
+ pandas-stubs/_typing.pyi:1223:16: warning[unused-ignore-comment] Unused blanket `type: ignore` directive
- Found 5086 diagnostics
+ Found 5087 diagnostics

rotki (https://github.com/rotki/rotki)
- rotkehlchen/chain/decoding/tools.py:99:13: error[invalid-argument-type] Argument to function `decode_transfer_direction` is incorrect: Expected `Sequence[A@BaseDecoderTools]`, found `Unknown | tuple[BTCAddress, ...] | tuple[ChecksumAddress, ...] | tuple[SubstrateAddress, ...] | tuple[SolanaAddress, ...]`
+ rotkehlchen/chain/decoding/tools.py:97:13: error[invalid-argument-type] Argument to function `decode_transfer_direction` is incorrect: Expected `(A@BaseDecoderTools & BTCAddress) | (A@BaseDecoderTools & ChecksumAddress) | (A@BaseDecoderTools & SubstrateAddress) | (A@BaseDecoderTools & SolanaAddress)`, found `A@BaseDecoderTools`
+ rotkehlchen/chain/decoding/tools.py:98:13: error[invalid-argument-type] Argument to function `decode_transfer_direction` is incorrect: Expected `(A@BaseDecoderTools & BTCAddress) | (A@BaseDecoderTools & ChecksumAddress) | (A@BaseDecoderTools & SubstrateAddress) | (A@BaseDecoderTools & SolanaAddress) | None`, found `A@BaseDecoderTools | None`
+ rotkehlchen/chain/decoding/tools.py:99:13: error[invalid-argument-type] Argument to function `decode_transfer_direction` is incorrect: Expected `Sequence[(A@BaseDecoderTools & BTCAddress) | (A@BaseDecoderTools & ChecksumAddress) | (A@BaseDecoderTools & SubstrateAddress) | (A@BaseDecoderTools & SolanaAddress)]`, found `Unknown | tuple[BTCAddress, ...] | tuple[ChecksumAddress, ...] | tuple[SubstrateAddress, ...] | tuple[SolanaAddress, ...]`
- Found 2092 diagnostics
+ Found 2094 diagnostics


Memory usage changes were detected when running on open source projects
prefect (https://github.com/PrefectHQ/prefect)
-     memo metadata = ~167MB
+     memo metadata = ~159MB


astral-sh-bot[bot] avatar Dec 18 '25 22:12 astral-sh-bot[bot]

CodSpeed Performance Report

Merging #22065 will improve performance by 4.64%

Comparing zb/cache-nearest (4a5c711) with main (ad41728)

Summary

⚡ 1 improvement
✅ 21 untouched
⏩ 30 skipped[^skipped]

Benchmarks breakdown

Mode Benchmark BASE HEAD Efficiency
Simulation hydra-zen 1.2 s 1.1 s +4.64%

[^skipped]: 30 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

codspeed-hq[bot] avatar Dec 18 '25 22:12 codspeed-hq[bot]

Hm running this again locally (due to the lack of difference on Pydantic in the CodSpeed report), I can't reproduce the performance difference on Pydantic. I'll poke around some more.

zanieb avatar Dec 18 '25 23:12 zanieb

Seems like it could be worth it just for the memory reduction reported by mypy_primer (though I don't fully understand why this would reduce memory usage)

AlexWaygood avatar Dec 18 '25 23:12 AlexWaygood

Before https://github.com/astral-sh/ruff/pull/22044, this was the best commit from https://github.com/astral-sh/ruff/pull/22052 but now the rest of https://github.com/astral-sh/ruff/pull/22052 looks more promising again 🙃

This commit alone looks like ~no difference on pydantic locally (and in CI) now.

zanieb avatar Dec 19 '25 00:12 zanieb

It could make sense to run ty locally with TY_MEMORY_REPORT=full between main/this branch.

One reason why memory usage might be reduced is that this PR deduplicates some salsa metadata because queries calling nearest_disjoint_base now track a single dependency on that query whereas before, they each tracked all the dependencies read in the nearest_disjoint_base function. But it's hard to tell from the aggregated metrics.

MichaReiser avatar Dec 19 '25 09:12 MichaReiser

After rebasing on main the sphinx memory difference went away and now there's a diff on prefect, feeling wary.

zanieb avatar Dec 19 '25 14:12 zanieb

After rebasing on main the sphinx memory difference went away and now there's a diff on prefect, feeling wary.

We round the the memory usage numbers to get more stable results. However, this also means that not all changes show in the memory usage report. You can run ty locally with TY_MEMORY_REPORT=full to get a detailed report without rounded numbers (which I find very useful because it also helps you understand the differences at the query level)

MichaReiser avatar Dec 19 '25 14:12 MichaReiser

I'm doing that, and the diff in sphinx went away :p

I also can't see a diff in Prefect locally.

zanieb avatar Dec 19 '25 14:12 zanieb