xarray-sql
PyArrow integration idea
PyArrow has a concept of a Dataset: tabular data, potentially larger than memory, that can be partitioned across multiple files.
https://arrow.apache.org/docs/python/dataset.html
This seems like quite a good fit for this project. I can see two approaches:
- Write a function that returns a PyArrow Dataset from an Xarray Dataset, with the arrays unraveled into table rows.
- Create a PyArrow FileFormat that reads Zarr (or maybe Xarray?) and performs the conversion to tables automatically (https://arrow.apache.org/docs/python/generated/pyarrow.dataset.FileFormat.html)