
Performance Improvement for evolve function

Open LukasKrocek opened this issue 2 years ago • 3 comments

Hello,

I have recently been working with the evolve function in attrs and noticed that it is slower than it needs to be. I believe this can be improved.

Here is an example I tried:

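import timeit

from attrs import evolve, frozen


@frozen
class A:
    a: int
    b: str


a = A(1, "1")

# Times are totals for timeit's default of 1,000,000 calls.
print(timeit.timeit("evolve(a, a=2)", globals=globals()))  # 0.5601 seconds
print(timeit.timeit("A(2, a.b)", globals=globals()))  # 0.2004 seconds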

The evolve function in this case appears to be almost three times slower than creating a new instance of the class manually.

Upon investigation, I discovered that a significant amount of time was spent on creating class Fields, iterating them, and updating unchanged values.

Additionally, I noticed that creating instances using kwargs took approximately 30% more time than using args:

print(timeit.timeit("A(a=2, b=a.b)", globals=globals())) # 0.2635 seconds

Given these findings, I suggest improving the performance of evolve by generating a per-class function the first time an instance of that class is evolved. For the class A above, the generated function would look something like this:

def evolve_A(inst, changes):
    # Unchanged fields fall back to the values on the existing instance.
    return A(
        changes.get("a", inst.a),
        changes.get("b", inst.b),
    )

I am open to creating a PR that would implement these changes if you think this is a good idea. I look forward to your thoughts on this.

Thank you.

LukasKrocek, Jun 09 '23 13:06

It would look something like this:

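from typing import Any, Protocol, Sequence, TypeVar

from attrs import Attribute, fields
# Private attrs helper; its location is assumed here and may differ between versions.
from attr._make import _generate_unique_filename

T = TypeVar("T")


class EvolveRegistryFunction(Protocol):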
    def __call__(self, cls: type[T], inst: T, changes: dict[str, Any]) -> T:
        ...


_EVOLVE_REGISTRY: dict[type[Any], EvolveRegistryFunction] = {}


def generate_evolve_func(cls: type[Any]):
    attrs: Sequence[Attribute] = fields(cls)
    args = {}
    kwargs = {}
    for a in attrs:
        if not a.init:
            continue
        attr_name = a.name  # To deal with private attributes.
        init_name = a.alias

        if a.kw_only:
            kwargs[init_name] = attr_name
        else:
            args[init_name] = attr_name

    fn_name = f"evolve_{cls.__name__}"
    code = [f"def {fn_name}(cls, inst, changes):"]  # TODO: add module?
    code.append("    return cls(")

    for init_name, attr_name in args.items():
        code.append(f"    changes.get('{init_name}', inst.{attr_name}),")

    for init_name, attr_name in kwargs.items():
        code.append(f"    {init_name}=changes.get('{init_name}', inst.{attr_name}),")

    code.append("    )")
    script = "\n".join(code)
    file_name = _generate_unique_filename(cls, "evolve")
    globs = {}
    eval(compile(script, file_name, "exec"), globs)
    _EVOLVE_REGISTRY[cls] = globs[fn_name]
    return globs[fn_name]


def evolve_2(inst, **changes):
    cls = inst.__class__
    evolve_func = _EVOLVE_REGISTRY.get(cls) or generate_evolve_func(cls)
    return evolve_func(cls, inst, changes)

LukasKrocek, Jun 09 '23 21:06

A note on kwargs performance: this is a general property of Python, not an attrs-specific performance degradation. Calling any callable with positional arguments is generally faster than passing the same arguments as keywords.
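
A quick way to check this without attrs is to time a plain function both ways. The sketch below uses a hypothetical function `make`; the absolute numbers depend on the interpreter version and machine, so none are quoted here.

```python
import timeit


def make(a, b):
    """Plain function with no attrs involvement (hypothetical example)."""
    return (a, b)


env = {"make": make}

# Same call, once with positional and once with keyword arguments.
positional = timeit.timeit("make(2, '1')", globals=env, number=200_000)
keyword = timeit.timeit("make(a=2, b='1')", globals=env, number=200_000)

print(f"positional: {positional:.4f}s, keyword: {keyword:.4f}s")
```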

As for the evolve registry, an alternative is to attach the generated function to the class itself as a method, named for example _evolve or __attrs_evolve__.

jamesmurphy-mc, Jul 02 '23 18:07