altair
Remove PyArrow dependency for Polars support
What is your suggestion?
Currently, PyArrow is required by Altair for Polars support. I think it shouldn't be too hard to remove it, given that Polars implements the dataframe interchange protocol natively (without depending on PyArrow).
If #3384 can make it in, then Altair would actually support plotting Polars dataframes natively without any extra heavy dependencies. That'd be... pretty amazing? If that were the case, I'd suggest using Altair for `polars.DataFrame.plot`.
I think what would need doing is:
- don't require `pyarrow` to be installed for the `dfi = data.__dataframe__` part
- instead of using `sanitize_arrow_table`, for Polars, just select date/datetime columns and call `.dt.to_string()`
- instead of using `to_pylist` from PyArrow, just use `DataFrame.rows(named=True)` for Polars
- for categoricals, find a non-pyarrow workaround for Polars in `infer_vegalite_type_for_dfi_column`. I haven't tried this yet, but it looks straightforward-ish
Would you be open to considering this? Happy to work on a PR if so.
Have you considered any alternative solutions?
Just keep the status quo :) But I think Altair is the only plotting library that gets close to native Polars support without extra large dependencies, and it doesn't look like a large stretch to go all the way there, so I'm hoping we can do it 💪
Demo from having tried this locally: