attrs
attrs copied to clipboard
Performance Improvement for evolve function
Hello,
I have been recently working with the evolve function in attrs and noticed some performance issues which I believe could be improved.
Here is an example I tried:
@frozen
class A:
a: int
b: str
a = A(1, "1")
print(timeit.timeit("evolve(a, a=2)", globals=globals())) # 0.5601 seconds
print(timeit.timeit("A(2, a.b)", globals=globals())) # 0.2004 seconds
The evolve function in this case appears to be almost three times slower than creating a new instance of the class manually.
Upon investigation, I discovered that a significant amount of time was spent on creating class Fields, iterating them, and updating unchanged values.
Additionally, I noticed that creating instances using kwargs took approximately 30% more time than using args:
print(timeit.timeit("A(a=2, b=a.b)", globals=globals())) # 0.2635 seconds
Given these findings, I suggest that we could improve the performance of the evolve function by generating per class functions the first time that the class is evolved. These functions would look something like this:
def evolve_A(inst, changes):
return cls(
changes.get("a", inst.a),
changes.get("b", inst.b)
)
I am open to creating a PR that would implement these changes if you think this is a good idea. I look forward to your thoughts on this.
Thank you.
It would look something like this:
class EvolveRegistryFunction(Protocol):
def __call__(self, cls: type[T], inst: T, changes: dict[str, Any]) -> T:
...
_EVOLVE_REGISTRY: dict[type[Any], EvolveRegistryFunction] = {}
def generate_evolve_func(cls: type[Any]):
attrs: Sequence[Attribute] = fields(cls)
args = {}
kwargs = {}
for a in attrs:
if not a.init:
continue
attr_name = a.name # To deal with private attributes.
init_name = a.alias
if a.kw_only:
kwargs[init_name] = attr_name
else:
args[init_name] = attr_name
fn_name = f"evolve_{cls.__name__}"
code = [f"def {fn_name}(cls, inst, changes):"] # TODO: add module?
code.append(" return cls(")
for init_name, attr_name in args.items():
code.append(f" changes.get('{init_name}', inst.{attr_name}),")
for init_name, attr_name in kwargs.items():
code.append(f" {init_name}=changes.get('{init_name}', inst.{attr_name}),")
code.append(" )")
script = "\n".join(code)
file_name = _generate_unique_filename(cls, "evolve")
globs = {}
eval(compile(script, file_name, "exec"), globs)
_EVOLVE_REGISTRY[cls] = globs[fn_name]
return globs[fn_name]
def evolve_2(inst, **changes):
cls = inst.__class__
evolve_func = _EVOLVE_REGISTRY.get(cls) or generate_evolve_func(cls)
return evolve_func(cls, inst, changes)```
A note on kwargs performance, this is a general fact about Python, not an attrs-specific performance degradation. Calling any Callable using positional args will be faster than passing the same args as kwargs.
As for the evolve registry, another alternative is to just stick another method on the class as an _evolve or __attrs_evolve__ function.