[Enh]: Support Daft DataFrame
We would like to learn about your use case. For example, if this feature is needed to adopt Narwhals in an open source project, could you please enter the link to it below?
We use Daft as it is a unified engine for Data Analytics, Engineering & ML/AI and fast: https://delta.io/blog/daft-delta-lake-integration/
Performance chart done by Daft team
Please describe the purpose of the new feature or describe the problem to solve.
It would be great to support Daft DataFrame, thanks! ☺️
Upstream Issues
- [x] https://github.com/Eventual-Inc/Daft/issues/4031
- [x] https://github.com/Eventual-Inc/Daft/issues/4032
- [ ] https://github.com/Eventual-Inc/Daft/issues/4033
- [x] https://github.com/Eventual-Inc/Daft/issues/4094
- [ ] https://github.com/Eventual-Inc/Daft/issues/4095
- [ ] https://github.com/Eventual-Inc/Daft/issues/4096
- [ ] https://github.com/Eventual-Inc/Daft/issues/4098
- [ ] https://github.com/Eventual-Inc/Daft/issues/4151
- [x] https://github.com/Eventual-Inc/Daft/issues/4220
Hey @hongbo-miao, this is definitely in scope. We were waiting for a big refactor to land. Now that it has, we can start working towards supporting Daft, but I have the suspicion that @MarcoGorelli has already been cooking something recently
I've ~~got a local branch~~ added (#2223) with a ~~mostly~~ finished CompliantDataFrame (for the other parts, see #2202, #2119, #2064).
From a quick look at daft.DataFrame it looks like it would be using CompliantLazyFrame.
Luckily, it'll be able to match (#2211) via daft.DataFrame.explain
It'll probably be easier to scope out the work after spec-ing CompliantLazyFrame - but that shouldn't take too long.
The API reference is an interesting read.
It seems like a mix of polars, pyspark, and pyarrow, but it also has some Image operations that seem novel.
> I have the suspicion that @MarcoGorelli has already been cooking something recently

Indeed, got something cooking, will update when it's in a more complete state
@MarcoGorelli you should've mentioned you've got a narwhals label in their repo!
I've linked all the issues you've sneakily opened
We're getting closer
@dangotbanned's work made this so much easier - well done Dan, thanks so much, it's hard to overstate how much impact you've had on this project
Thanks @MarcoGorelli, really means a lot!

Hey folks! A few questions here:
- Would `narwhals` use the `daft` executor directly under the hood? Or translate the `daft.DataFrame` to an internal representation to perform its operations? I'm mostly interested in how this interfaces with `daft` on `ray`.
- How is the `narwhals` typing system mapped to `daft`'s typing system? `daft` supports Tensors (including `numpy` arrays), for example.
- Would using `narwhals` remove `daft`'s ability to attach GPUs to transforms?
Hey @NellyWhads !
The idea is that if you have a daft.DataFrame, you can pass it to nw.from_native. Then any operation you perform using the Narwhals API gets mapped to the daft DataFrame API.
Any characteristics of the original object (e.g. what it's connected to, where it runs) should remain unchanged.
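To make the "thin wrapper" idea concrete, here is a toy sketch in plain Python. It is not Narwhals' or Daft's actual implementation: `ToyNativeFrame`, `ToyWrapper`, the `runner` attribute, and this local `from_native` are all hypothetical stand-ins invented for illustration. It only shows the pattern described above: calls on the wrapper are translated into calls on the native object, and whatever the native object carries (here, `runner`) passes through untouched.

```python
from dataclasses import dataclass


@dataclass
class ToyNativeFrame:
    """Hypothetical stand-in for a backend DataFrame (not daft's real API)."""

    rows: list
    runner: str = "local"  # stands in for "where it runs"

    def where(self, predicate):
        # The backend's own name for filtering in this toy example.
        return ToyNativeFrame([r for r in self.rows if predicate(r)], self.runner)

    def select(self, *names):
        return ToyNativeFrame(
            [{n: r[n] for n in names} for r in self.rows], self.runner
        )


class ToyWrapper:
    """Hypothetical uniform API that only delegates to the native object."""

    def __init__(self, native):
        self._native = native

    def filter(self, predicate):
        # The wrapper's `filter` is mapped onto the backend's `where`.
        return ToyWrapper(self._native.where(predicate))

    def select(self, *names):
        return ToyWrapper(self._native.select(*names))

    def to_native(self):
        # Hand back the backend object; no conversion or copy into
        # an intermediate representation happens in the wrapper.
        return self._native


def from_native(native):
    """Toy counterpart of the wrapping step (assumption, for illustration)."""
    return ToyWrapper(native)


native = ToyNativeFrame(
    [{"a": 1, "b": 4}, {"a": 2, "b": 5}, {"a": 3, "b": 6}], runner="cluster"
)
result = from_native(native).filter(lambda r: r["a"] > 1).select("b").to_native()
print(result.rows)
print(result.runner)
```

The result is still a `ToyNativeFrame` with `runner == "cluster"`, which mirrors the point that the wrapper never takes over execution from the backend.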
Hey @MarcoGorelli
Do you have a rough timeline for this, or is there anything one could do to support it? :) We'd love to replace Dask with Daft through narwhals, if that's possible in any way. EDIT: Ah oops, only now saw that there is already a WIP plugin available: https://github.com/MarcoGorelli/narwhals-daft
Hey @jonded94 !
If you want to try it out, you can do
pip install git+https://github.com/MarcoGorelli/narwhals.git@daft
Here's a little demo, showing tpc-h q1 (which does various aggregations and filters): https://www.kaggle.com/code/marcogorelli/daft-via-narwhals?scriptVersionId=243096444
As for something stable, narwhals-daft is indeed what you're after. We've just finished getting the plugin mechanism ready.
I don't want to make any promises as to when it'll be fully tested and ready, but if anyone's interested in funding the effort please contact [email protected] and we can promise it as a deliverable by some arranged date. It'll probably cost less than what you're expecting and will be well worth it
@jonded94 I got a bit carried away, and narwhals-daft is now published and installable!
https://github.com/narwhals-dev/narwhals-daft
pip install narwhals-daft
and the rest should just work: https://www.kaggle.com/code/marcogorelli/daft-via-narwhals?scriptVersionId=275198341
Curious to hear how you get on with it!
And if anyone fancies helping out with the missing methods, here's the tracking issue: https://github.com/narwhals-dev/narwhals-daft/issues/35
one day https://github.com/ibis-project/ibis/issues/8904 might also happen