plotly.py

Explore using Narwhals in Plotly Express

Open LiamConnors opened this issue 1 year ago • 5 comments

Narwhals is a compatibility layer between Polars, pandas, and other dataframes. https://narwhals-dev.github.io/narwhals/

This issue is to explore what changes we would need to make to use Narwhals in Plotly Express.

LiamConnors avatar Sep 03 '24 18:09 LiamConnors

I'm glad that this is being explored! Especially since pandas 3.0 could still add a hard dependency on pyarrow (that seems to be the current plan unless the new PDEP delaying the dependency is approved), and since both pyarrow and pandas currently seem to be required for polars to work with Plotly Express.

Not sure if pointing this out here is kosher, and you may already be aware of this, but I understand that this is the (main) PR where altair switched to using narwhals, just for reference: https://github.com/vega/altair/pull/3452. If it's not kosher, feel free to edit this out.

firai avatar Sep 11 '24 12:09 firai

Hey there! Thanks for considering Narwhals as an option to make the Plotly Express module more dataframe-agnostic. I started to look at what that would involve, and I believe there are a couple of things worth mentioning:

  • Narwhals is still missing a few features that would make the integration seamless (e.g. the DataFrame.unpivot and DataFrame.cast methods, both of which are work in progress)
  • when a trendline is requested, we can only go so far: at a certain point statsmodels is used, and we need to pass it something that statsmodels supports (i.e. pandas or numpy), so we will need to trigger a conversion on the user's behalf. I mention this just to raise awareness of it.
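
To illustrate the trendline point, here is a minimal sketch of what "triggering a conversion" could look like. The helper name `to_statsmodels_input` and the stand-in class `FakePolarsFrame` are hypothetical and are not taken from the branch; the sketch only assumes that non-pandas inputs expose a `to_pandas()` method, as polars.DataFrame and pyarrow.Table do.

```python
def to_statsmodels_input(df):
    """Hypothetical helper: return an object that statsmodels can consume.

    Assumption: non-pandas inputs expose a ``to_pandas()`` method (as
    polars.DataFrame and pyarrow.Table do). Anything else is passed through
    unchanged, which covers pandas and numpy inputs.
    """
    convert = getattr(df, "to_pandas", None)
    if callable(convert):
        # This is the conversion done on the user's behalf before the
        # data is handed to statsmodels.
        return convert()
    return df


class FakePolarsFrame:
    """Stand-in for a polars/pyarrow object so the sketch runs without them."""

    def __init__(self, data):
        self.data = data

    def to_pandas(self):
        # A real frame would return a pandas.DataFrame here.
        return {"converted": self.data}


frame = FakePolarsFrame({"x": [1, 2], "y": [3, 4]})
converted = to_statsmodels_input(frame)   # goes through to_pandas()
untouched = to_statsmodels_input([1, 2])  # no to_pandas(), passed through
```

With a real polars frame the same call would need both pandas and pyarrow installed, so the conversion would happen only on the trendline path rather than for every plot.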

Time permitting, once those WIP features are merged and released, I will take a closer look again.

Edit: For progress updates 😁 branch I am working on

FBruzzesi avatar Sep 24 '24 19:09 FBruzzesi

Hey Francesco! It looks like you've made great progress so far. Is there anything the Plotly team can do to support what you're working on? We're very interested in taking this feature further. For now we're working on typed array support in https://github.com/plotly/plotly.py/pull/4470, and I can imagine that Narwhals support could take this integration even further. Cheers!

ndrezn avatar Oct 07 '24 14:10 ndrezn

Yes, the PR is almost ready. I am able to run the entire test suite successfully with polars and pyarrow on a narwhals branch (with features from https://github.com/narwhals-dev/narwhals/pull/1145). The other required features are also in main but have not been released yet.

As soon as we make a new release I should be able to open the PR. I would expect it to be of a similar size to #4470 in terms of line changes. I was wondering if there is a good approach to make it easier to review, other than commenting on it in great detail.

Let me cc @MarcoGorelli as well into the thread 😁

FBruzzesi avatar Oct 07 '24 14:10 FBruzzesi

Opening a PR as a draft very soon. I am a bit unsure where to set the test dependencies and how CI is run 🙈

FBruzzesi avatar Oct 09 '24 11:10 FBruzzesi