polars
polars copied to clipboard
Add __slots__ to pl.Expr and pl.LazyFrame
Description
Profiled count of created pl.Expr and pl.LazyFrame in my use case (graphQL api on top of polars). Found that one of the heaviest queries creates pl.Expr around 3k times and pl.LazyFrame around 1k times.
Tried to "monkey patch" adding slots on them directly in .venv and found that it seems to be working ok.
Will follow up with PR later.
What will that do?
__slots__ improves the performance of class attribute access, sometimes by quite a bit. I tried playing with this in dataframe/series a while back and ran into issues with namespace registration, since using __slots__ means you can't dynamically add attributes to a class. To get around this, I had to create a separate namespace dict and override __getattr__ to detect namespace retrieval, and in doing so removed the performance benefit of using __slots__.
Well, then registration of namespace would be better handled by using descriptors.
Like this. Will experiment on this once I get to PR.
class Namespace:
def __get__(self, obj, objtype=None):
return "FooNamespace"
class Foo:
__slots__ = ("_bar",)
Foo.namespace = Namespace()
instance = Foo()
print(instance.namespace) # prints FooNamespace
print(Foo.namespace) # prints FooNamespace too
Having __slots__ only means that you can't add attributes to instances, not class itself . Class object would still have __dict__.
NameSpace class that wraps plugins is already is a descriptor. And it seems like it's registered on classes in _create_namespace, and that should not interact with __dict__ of instances in any way.
So, given that it goes to Expr.__dict__ I don't see why you would need to define __getattr__ on Expr.
@mcrumiller Am I missing something here?
@Object905 if it works, go for it! I may have done something incorrect, and this was a while ago so I don't recall the details. Are you talking about adding _expr and _ldf to the __slots__?
Yes. Builtin namespaces, like str, list, etc also good candidates.
@Object905 what do you recommend for adding attributes to instances, given that after this change one can't use setattr?
@knl I was just about to make a feature request for an empty attr dictionary.