[Enh]: Support `Expr.meta.(serialize|deserialize)`
We would like to learn about your use case. For example, if this feature is needed to adopt Narwhals in an open source project, could you please enter the link to it below?
This issue was created as a follow-up to a Discord message.
Please describe the purpose of the new feature or describe the problem to solve.
The purpose is to have some way to pass serialized expressions between different processes (they could be on the same machine, or different machines). For example in a client-server case:
- the client serializes a Narwhals expression (for selection, projection, aggregation, etc.)
- the client sends the serialized expression to a server
- the server deserializes the incoming expression and executes it
- the server does something with the execution result
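The four steps above can be sketched with the standard library alone. This is only an illustration of the flow, not a Narwhals API: the nested-dict "expression", the `OPS` table, and the `serialize`, `deserialize`, and `execute` helpers are all hypothetical stand-ins, and the bytes returned by `serialize` are what would travel between processes.

```python
import json
import operator

# Hypothetical wire format: a small dict standing in for a real expression.
OPS = {"gt": operator.gt, "lt": operator.lt, "eq": operator.eq}


def serialize(expr: dict) -> bytes:
    """Client side: turn the expression into bytes that can be sent."""
    return json.dumps(expr).encode("utf-8")


def deserialize(payload: bytes) -> dict:
    """Server side: rebuild the expression from the received bytes."""
    return json.loads(payload.decode("utf-8"))


def execute(expr: dict, rows: list) -> list:
    """Server side: apply the deserialized filter to in-memory rows."""
    compare = OPS[expr["op"]]
    return [row for row in rows if compare(row[expr["column"]], expr["value"])]


def handle_request(payload: bytes, rows: list) -> list:
    """Deserialize an incoming payload and run it against the server's data."""
    return execute(deserialize(payload), rows)
```

A client would call `serialize({"op": "gt", "column": "age", "value": 18})`, send the bytes over whatever transport is in use, and the server would pass them to `handle_request` together with its own data.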
Suggest a solution if possible.
Something like nw.Expr.meta.serialize and nw.Expr.deserialize?
If you have tried alternatives, please describe them below.
polars offers expression serialisation/deserialisation via polars.Expr.meta.serialize and polars.Expr.deserialize, but this requires the communicating processes to have the same version of polars for maximum compatibility.
Additional information that may help us understand your needs.
No response
@3ok thanks for raising this issue!
Narwhals
nw.Expr.meta.(serialize|deserialize) will start to become a possibility with #2571.
I've been making progress over at #2572 - but it's quite a big job 😅
I think most of what you've described should be achievable. However this part of the issue will take much longer:
polars offers expression serialisation/deserialisation via polars.Expr.meta.serialize and polars.Expr.deserialize, but this requires the communicating processes to have the same version of polars for maximum compatibility.
If/when #2572 lands, I expect there'll be an initial period of instability while we work on getting things right.
But once we're over that hurdle, narwhals has the added constraint of supporting 11 backends.
So that'll force a certain level of stability on us that polars doesn't have to contend with.
It is still too early to say what level of stability that might be - and I definitely don't want that PR to block the narwhals internals from evolving.
Suggestion
Use polars and pin the version in your package/app's dependencies, so it can only be installed with that version.
Depending on the internals of a project is tricky - but possible with enough effort and reducing the scope to a single version.
I'd love for my WIP to be the solution for you, but polars is available now and I think would be a better fit for you 🙂
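Alongside pinning the version in the dependency list, the receiving process could also verify at runtime that the sender used the same version. The sketch below is one possible way to do that, under assumptions: `wrap` and `unwrap` are hypothetical helpers, the payload is assumed to be the bytes produced by the sender's serializer, and the version strings are assumed to come from something like `polars.__version__` on each side.

```python
import base64
import json


def wrap(payload: bytes, producer_version: str) -> str:
    """Bundle serialized bytes with the library version that produced them."""
    envelope = {
        "version": producer_version,
        "payload": base64.b64encode(payload).decode("ascii"),
    }
    return json.dumps(envelope)


def unwrap(message: str, consumer_version: str) -> bytes:
    """Return the payload only if producer and consumer versions match."""
    envelope = json.loads(message)
    if envelope["version"] != consumer_version:
        raise ValueError(
            f"version mismatch: payload built with {envelope['version']}, "
            f"receiver runs {consumer_version}"
        )
    return base64.b64decode(envelope["payload"])
```

Failing early with a clear message is usually easier to debug than a deserialization error deep inside the library.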
Hello!
The goal I'm trying to achieve has a lower scope and is slightly different, but I think it's similar enough to bring it up here.
I would like to be able to use pieces of SQL(-like) - in my case filters - in yaml files and CLI args, and have them parsed into Narwhals expressions. We were able to achieve some success with the help of sqlglot, but that's not finalized yet.
I would also like to do the opposite operation, but that's not required right now.
Clearly this would be a very useful feature for Narwhals. It will enable so many use cases for partially templated workflows, or for workflows that can make use of user input at launch-time (e.g. having CLIs, config editors).
@MarcoGorelli what are your thoughts on this?
yup, definitely in scope
since an expression is just stored as a tuple of ExprNodes, it should be fairly straightforward to serialise / deserialise
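To illustrate why a flat tuple of nodes lends itself to serialisation, here is a minimal sketch. The `ExprNode` dataclass below is a hypothetical stand-in with made-up fields, not the actual Narwhals internal class, and the `dumps` and `loads` helpers are likewise assumptions for the example.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ExprNode:
    """Hypothetical node: an operation name plus flat, JSON-friendly arguments."""

    name: str
    args: tuple = ()


def dumps(nodes: tuple) -> str:
    """Serialise a tuple of nodes to a JSON string."""
    return json.dumps([asdict(node) for node in nodes])


def loads(text: str) -> tuple:
    """Rebuild the tuple of nodes from a JSON string."""
    # JSON has no tuple type, so args come back as lists and are converted.
    return tuple(ExprNode(item["name"], tuple(item["args"])) for item in json.loads(text))
```

With this shape, something like `nw.col("user_age") > 18` could be represented as `(ExprNode("col", ("user_age",)), ExprNode("gt", (18,)))`. Arguments that are not plain JSON values (nested expressions, dtypes, callables) would need extra handling that this sketch leaves out.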
My comment was not about ser/de to an arbitrary format, but specifically about SQL representation.
not sure i understand sorry, could you show a full example of what you'd like to do please?
I'd like to have something like this:
table_a: ~
table_b:
  deps: [table_a]
  filters:
    - "user_age > 18"
Or:
python script.py --filter "user_age > 18"
I would like to have a standard way (or even just a documented example) to parse "user_age > 18" into nw.col("user_age") > 18.
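For simple comparisons like the one above, a small hand-written parser may be enough. The sketch below is an assumption-laden illustration, not an existing Narwhals feature: `parse_filter` and `_literal` are hypothetical names, only `column <op> literal` filters are supported, and the column factory is passed in as `col` so that `nw.col` can be supplied by the caller.

```python
import operator
import re

# Supported comparison operators; longer tokens are listed first in the regex.
_OPS = {
    ">=": operator.ge,
    "<=": operator.le,
    "!=": operator.ne,
    "==": operator.eq,
    "=": operator.eq,
    ">": operator.gt,
    "<": operator.lt,
}
_PATTERN = re.compile(r"^\s*([A-Za-z_]\w*)\s*(>=|<=|!=|==|=|>|<)\s*(.+?)\s*$")


def _literal(raw: str):
    """Interpret the right-hand side as int, then float, else a (quoted) string."""
    for cast in (int, float):
        try:
            return cast(raw)
        except ValueError:
            pass
    return raw.strip("'\"")


def parse_filter(text: str, col):
    """Turn 'name <op> literal' into col(name) <op> literal."""
    match = _PATTERN.match(text)
    if match is None:
        raise ValueError(f"unsupported filter: {text!r}")
    name, op, raw = match.groups()
    return _OPS[op](col(name), _literal(raw))
```

Assuming Narwhals is imported as `nw`, `parse_filter("user_age > 18", nw.col)` would build the same expression as `nw.col("user_age") > 18`. Anything beyond single comparisons (AND/OR, function calls, nested expressions) would need a real SQL parser such as the sqlglot approach mentioned earlier in the thread.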
thanks - i think this is a very different request, could you open a new issue please?
Created #3310