Deep equality of pl.Expr trees
Describe your feature request
pl.Expr objects supply a wide array of operator overloads, including __eq__, which is great syntactic sugar for constructing more complicated nested predicate expressions. However, because __eq__ is overloaded, we no longer have a way to recursively check that two pl.Expr objects are in fact equal to each other. Semantically:
pl.lit(3) == pl.lit(5)
is a boolean expression. I would like something like a .equals() method:
pl.lit(3).equals(pl.lit(5))
which is a boolean value in Python. Here, it would be False, because pl.lit(3) is an expression node with a value of 3 as its member, while pl.lit(5) is an expression node with a value of 5 as its member. One can imagine more complicated expressions:
(pl.lit(3) + pl.lit(5)).equals(pl.lit(3) + (pl.lit(1) + pl.lit(4)))
would yield False, although the boolean expression would evaluate to true, since the LHS looks like:
add
├─ pl.lit(3)
└─ pl.lit(5)
And the RHS looks like:
add
├─ pl.lit(3)
└─ add  # This node is different!
   ├─ pl.lit(1)
   └─ pl.lit(4)
This would be really helpful if we want applications to build upon polars, since those applications should have tests to verify correctness of the polars expressions created underneath.
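To make the distinction concrete, here is a minimal sketch using a toy expression class. Node, lit, and equals are hypothetical names for illustration only; this is not the polars implementation, just the shape of the behaviour being requested: == builds a new expression, while equals() walks both trees and returns a plain bool.

```python
from typing import Any, Tuple


class Node:
    """Toy stand-in for an expression node (hypothetical, not pl.Expr)."""

    def __init__(self, op: str, value: Any = None, children: Tuple["Node", ...] = ()):
        self.op = op
        self.value = value
        self.children = tuple(children)

    def __add__(self, other: "Node") -> "Node":
        # Operator overload: builds an "add" node instead of computing a sum.
        return Node("add", children=(self, other))

    def __eq__(self, other: "Node") -> "Node":  # type: ignore[override]
        # Operator overload: builds an "eq" node instead of comparing.
        return Node("eq", children=(self, other))

    def equals(self, other: "Node") -> bool:
        # Deep structural comparison: same operation, same payload,
        # and pairwise-equal children, checked recursively.
        return (
            self.op == other.op
            and self.value == other.value
            and len(self.children) == len(other.children)
            and all(a.equals(b) for a, b in zip(self.children, other.children))
        )


def lit(value: Any) -> Node:
    """Build a literal node (hypothetical helper mirroring pl.lit)."""
    return Node("lit", value=value)


# `==` yields another expression node, not a bool.
predicate = lit(3) == lit(5)

# `equals` yields a Python bool based on tree shape, not on evaluated value.
same_shape = (lit(3) + lit(5)).equals(lit(3) + lit(5))
different_shape = (lit(3) + lit(5)).equals(lit(3) + (lit(1) + lit(4)))
```

In a test suite for an application built on top of such expressions, equals() is what an assertion would call, since asserting on the result of == would only check the truthiness of an expression object.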
I am thinking about an Expr.meta -> MetaExpr namespace. This can implement the magic methods on a meta level, for example comparing expressions by their expression trees. We can also add methods that allow you to modify an existing expression, such as MetaExpr.pop for popping the latest expression of the tree.
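A rough sketch of how such a namespace could be laid out, again on a toy class rather than the real polars types. Everything here is an assumption for illustration: Expr and MetaExpr are hypothetical classes, meta is written as a property, and pop is assumed to return the inputs of the outermost node.

```python
from typing import Any, List, Tuple


def _same_tree(a: "Expr", b: "Expr") -> bool:
    # Recursive structural comparison of two toy expression trees.
    return (
        a.op == b.op
        and a.value == b.value
        and len(a.inputs) == len(b.inputs)
        and all(_same_tree(x, y) for x, y in zip(a.inputs, b.inputs))
    )


class MetaExpr:
    """Hypothetical meta-level view; magic methods here act on the tree itself."""

    def __init__(self, expr: "Expr"):
        self._expr = expr

    def __eq__(self, other: object) -> bool:
        # On the meta level, == is an ordinary comparison returning a bool.
        if not isinstance(other, MetaExpr):
            return NotImplemented
        return _same_tree(self._expr, other._expr)

    def pop(self) -> List["Expr"]:
        # Assumed semantics: drop the outermost node and return its inputs.
        return list(self._expr.inputs)


class Expr:
    """Toy expression whose operators build trees (hypothetical, not pl.Expr)."""

    def __init__(self, op: str, value: Any = None, inputs: Tuple["Expr", ...] = ()):
        self.op = op
        self.value = value
        self.inputs = tuple(inputs)

    @property
    def meta(self) -> MetaExpr:
        return MetaExpr(self)

    def __add__(self, other: "Expr") -> "Expr":
        return Expr("add", inputs=(self, other))

    def __eq__(self, other: "Expr") -> "Expr":  # type: ignore[override]
        return Expr("eq", inputs=(self, other))


def lit(value: Any) -> Expr:
    return Expr("lit", value=value)


total = lit(3) + lit(5)
inputs_of_total = total.meta.pop()  # the two literal nodes under "add"
```

Keeping the comparison on MetaExpr leaves Expr.__eq__ free to keep building predicate expressions, which is the separation the namespace is meant to provide.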
I think this is a good idea. To better understand it, would we be able to convert between Expr and MetaExpr via some_expr.meta() and some_meta_expr.expr()? I would really like the ability to introspect exprs (especially e.g. Fields and what column name they have), so this would be a great addition that solves multiple problems :)
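For illustration, the round trip and the column-name introspection asked about here might look roughly like the sketch below. All names (Expr, MetaExpr, col, lit, expr(), root_names()) are hypothetical and operate on a toy tree; meta is written as a property, though a method call as spelled in the question would work equally well for this sketch.

```python
from typing import Any, List, Tuple


class MetaExpr:
    """Hypothetical meta view offering read-only introspection of a toy tree."""

    def __init__(self, expr: "Expr"):
        self._expr = expr

    def expr(self) -> "Expr":
        # Convert back: return the expression this view wraps.
        return self._expr

    def root_names(self) -> List[str]:
        # Collect column names from leaf "col" nodes, left to right,
        # keeping the first occurrence of each name.
        names: List[str] = []
        stack = [self._expr]
        while stack:
            node = stack.pop()
            if node.op == "col" and node.value not in names:
                names.append(node.value)
            # Push inputs reversed so the leftmost input is visited first.
            stack.extend(reversed(node.inputs))
        return names


class Expr:
    """Toy expression node (hypothetical, not pl.Expr)."""

    def __init__(self, op: str, value: Any = None, inputs: Tuple["Expr", ...] = ()):
        self.op = op
        self.value = value
        self.inputs = tuple(inputs)

    @property
    def meta(self) -> MetaExpr:
        return MetaExpr(self)

    def __add__(self, other: "Expr") -> "Expr":
        return Expr("add", inputs=(self, other))


def col(name: str) -> Expr:
    return Expr("col", value=name)


def lit(value: Any) -> Expr:
    return Expr("lit", value=value)


e = col("a") + (col("b") + lit(1))
round_tripped = e.meta.expr()      # back to the original expression object
columns = e.meta.root_names()      # column names referenced by the tree
```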
A meta namespace sounds clean; you definitely don't want to mix with root-level Expr->Expr methods.
I solved the same problem for our in-house data DSL (for which polars is a target/engine) by having a dedicated introspection module (a bit like Python's inspect), so the separation was explicit (more like exprs_equal(e1, e2) than e1.meta.equals(e2)). The namespace concept is very consistent within polars, so a .meta makes a lot of sense.
(One nice feature we have is support for custom visitors, allowing for flexible introspection/rewrites/optimisations of arbitrary expression trees; probably a bit harder to offer something like that here, given that the Expr object really lives down in Rust though? :thinking:).
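As a rough idea of what a custom visitor over an expression tree can look like, here is a small sketch on a toy Python tree. Node, Visitor, and FoldConstants are hypothetical names; this says nothing about how it would be exposed for Rust-backed polars expressions. The visitor rebuilds the tree bottom-up and dispatches to an optional visit_<op> hook, and the example subclass uses that hook to fold additions of two literals.

```python
from typing import Any, Tuple


class Node:
    """Toy expression node (hypothetical, not pl.Expr)."""

    def __init__(self, op: str, value: Any = None, inputs: Tuple["Node", ...] = ()):
        self.op = op
        self.value = value
        self.inputs = tuple(inputs)

    def __add__(self, other: "Node") -> "Node":
        return Node("add", inputs=(self, other))


def lit(value: Any) -> Node:
    return Node("lit", value=value)


def col(name: str) -> Node:
    return Node("col", value=name)


class Visitor:
    """Bottom-up rewriting visitor: children first, then the node itself."""

    def visit(self, node: Node) -> Node:
        new_inputs = tuple(self.visit(child) for child in node.inputs)
        rebuilt = Node(node.op, node.value, new_inputs)
        # Dispatch to visit_<op> if the subclass defines it.
        handler = getattr(self, f"visit_{node.op}", None)
        return handler(rebuilt) if handler is not None else rebuilt


class FoldConstants(Visitor):
    """Example rewrite: replace add(lit, lit) with a single literal."""

    def visit_add(self, node: Node) -> Node:
        left, right = node.inputs
        if left.op == "lit" and right.op == "lit":
            return lit(left.value + right.value)
        return node


# 3 + (1 + 4) collapses to a single literal node after folding.
folded = FoldConstants().visit(lit(3) + (lit(1) + lit(4)))

# With a column involved, only the literal-only subtree is folded.
partly_folded = FoldConstants().visit(col("a") + (lit(1) + lit(4)))
```

The same base class can back read-only introspection as well, by having the hooks record information and return the node unchanged.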