msgspec.structs.fields() is 20x slower than dataclasses.fields()
Description
Hello, in a project where I have to call msgspec.structs.fields frequently, I noticed that this function is extremely slow compared to dataclasses.fields (about 20x slower).
from dataclasses import dataclass, fields
from timeit import timeit

import msgspec

msgspec_fields = msgspec.structs.fields


@dataclass
class D:
    x: int
    y: int | None = None
    z: str | None = None


class M(msgspec.Struct):
    x: int
    y: int | None = None
    z: str | None = None


d = D(1)
m = M(1)

print("dataclass.fields = ", end="")
print(timeit(lambda: fields(d), number=1000000))
print("msgspec.structs.fields = ", end="")
print(timeit(lambda: msgspec_fields(m), number=1000000))
Output:
dataclass.fields = 0.3529520556330681
msgspec.structs.fields = 7.523276620544493
Tested with Python 3.13.7 and msgspec 0.19.0 and 0.20.0 (same result with both versions).
Maybe simply caching the data in a classvar is a fast solution here?
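Until something like that exists in the library, a user-side workaround could look roughly like the sketch below. It is not part of msgspec: make_cached_fields and cached_fields are hypothetical names, and the sketch assumes the field set of a class never changes after the class is created. It is demonstrated with dataclasses.fields so it runs without msgspec installed; since msgspec.structs.fields also accepts a type or an instance, passing it in instead should work the same way.

```python
from dataclasses import dataclass, fields
from typing import Any, Callable, Dict, Optional


def make_cached_fields(fields_func: Callable[[Any], Any]) -> Callable[[Any], tuple]:
    """Wrap a fields()-style function so its result is computed once per class.

    Hypothetical helper: assumes the fields of a class are fixed after creation.
    """
    cache: Dict[type, tuple] = {}

    def cached_fields(type_or_instance: Any) -> tuple:
        # Accept either a class or an instance, like the wrapped function does.
        if isinstance(type_or_instance, type):
            cls = type_or_instance
        else:
            cls = type(type_or_instance)
        try:
            return cache[cls]
        except KeyError:
            # First lookup for this class: compute once and remember the result.
            result = cache[cls] = tuple(fields_func(cls))
            return result

    return cached_fields


@dataclass
class D:
    x: int
    y: Optional[int] = None


# Swap `fields` for `msgspec.structs.fields` to cache Struct field info instead.
cached_fields = make_cached_fields(fields)
names = [f.name for f in cached_fields(D(1))]
print(names)
```

Note that the cache keeps a strong reference to every class it has seen, which is usually fine for classes defined at module level but may matter if classes are created dynamically.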
Huh, interesting. I suspect the issue is twofold:
- We're rebuilding this info on every invocation of fields
- We, very expensively, inspect the type annotations during that (https://github.com/jcrist/msgspec/blob/a9ed8f12f11269704aa25680a2273287485c9f5a/src/msgspec/structs.py#L91)
I think it should be possible to speed this up by a lot, as we should have all the information required for this available already within the internal struct state.
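To get a rough feel for why resolving annotations on every call adds up, here is a small stdlib-only sketch. It uses typing.get_type_hints as a stand-in for the annotation inspection mentioned above; that substitution is an assumption for illustration and is not the code msgspec actually runs. The class D and the variable names are likewise made up for the example.

```python
from dataclasses import dataclass
from timeit import timeit
from typing import Optional, get_type_hints


@dataclass
class D:
    x: int
    y: Optional[int] = None
    z: Optional[str] = None


# Stand-in for "inspect the type annotations on every invocation".
per_call = timeit(lambda: get_type_hints(D), number=10_000)

# Stand-in for "reuse state that already exists": resolve once, then just look it up.
resolved = get_type_hints(D)
reuse = timeit(lambda: resolved, number=10_000)

print(f"resolve on every call: {per_call:.4f}s")
print(f"reuse resolved hints:  {reuse:.4f}s")
```

The absolute numbers depend on the machine and Python version; the point is only the gap between redoing the work on each call and reading a result that was computed once.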