ENH: Preserve attrs in to_dataframe()
- [x] Closes #5327
- [x] Tests added
- [x] Passes `pre-commit run --all-files`
- [x] User visible changes (including notable bug fixes) are documented in `whats-new.rst`
Thanks @snowman2
If you're interested, see #3497 for the inverse problem of using pandas attrs when constructing Xarray objects (in a future PR) :)
Was there any reason this stalled? It looked like a good start!
I believe the issue is that pandas.DataFrame does not support column attrs (or did not? I didn't check whether that changed since then). DataFrame-level attrs should work, though.
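For reference, here is a minimal sketch of what DataFrame-level attrs look like when copied over by hand. It is not the implementation in this PR. The dataset contents and attribute names are made up for illustration, and it assumes a pandas version that has `DataFrame.attrs` (1.0 or later).

```python
import xarray as xr

# Hypothetical dataset; variable and attribute names are illustrative only.
ds = xr.Dataset(
    {"temperature": ("x", [11.5, 12.0, 12.5])},
    coords={"x": [0, 1, 2]},
    attrs={"title": "demo", "source": "synthetic"},
)

df = ds.to_dataframe()

# Copy the Dataset-level attrs onto the DataFrame explicitly.
# DataFrame.attrs is a plain dict, so update() works whether or not
# to_dataframe() already carried anything over.
df.attrs.update(ds.attrs)

print(df.attrs)
```

Per-variable attrs (e.g. `ds["temperature"].attrs`) are deliberately left out here, since column-level attrs are the part that may not be supported.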
They were thinking of removing it at one point: https://github.com/pandas-dev/pandas/issues/52166, also https://github.com/dask/dask/issues/11146
Perhaps we should punt until someone really, really wants it?
Yes, looks like the conclusion from the pandas issue is they want to keep it but the support is spotty.
We should probably close this unless someone comes to save it, but I would vote to merge a PR that did this; I can't see a downside.
Hi, I just saw that this discussion has been picking up. I work on the framework mentioned in this comment https://github.com/pandas-dev/pandas/issues/52166#issuecomment-2178473375 and we would be very happy if the DataFrame-level attrs were added back to dask DataFrames. We don't use the column-level attrs but do use the df-level attrs. Currently, our workaround for using the latest dask is to ask users to change the config like so:
dask.config.set({'dataframe.query-planning': False})
Would this PR preserve the attrs also in the dask-expr backend?