VirtualiZarr
[C]Worthy OAE dataset example
- [x] Closes #132
- [ ] Changes are documented in `docs/releases.rst`
A list of everything about this demo that is janky and should be fixed before merging, in order of most to least janky:
- [ ] Functions fail intermittently for unknown reasons, either with an HTTP error (which makes some sense) or saying they ran out of memory, which makes no sense
- [ ] Adding per-task retries might fix this (https://github.com/zarr-developers/VirtualiZarr/pull/575), however Lithops retries don't work properly, see https://github.com/lithops-cloud/lithops/issues/1429
- [x] https://github.com/zarr-developers/VirtualiZarr/issues/574
- [ ] Needs the option to cache the full file (#564) to be merged, see #625
- [ ] Even then still relies on the unreleased `develop` branch of VirtualiZarr
- [ ] Can't glob for filepaths in bucket, so we need https://github.com/zarr-developers/VirtualiZarr/issues/569
- [ ] `combine_by_coords` didn't work because it triggered a reindex, and I don't know why
- [x] https://github.com/zarr-developers/numcodecs/issues/744
- [ ] ManifestStore can't load scalars #530
- [ ] Have to rename paths to non-http URLs, because @maxrjones's `cache` PR generates http URLs, but Icechunk can't store them yet (https://github.com/earth-mover/icechunk/issues/526) - EDIT: for cases that don't require auth this should now work, but is untested
- [ ] I had to add the `--provenance` kwarg to my local docker build and I don't actually know if I need that
- [ ] I have to manually paste my lithops credentials into the `.lithops_config` because they seem not to be discovered when set as environment variables in the notebook
- [x] The `open_virtual_mfdataset` function is not documented, but was added in https://github.com/zarr-developers/VirtualiZarr/pull/349 (docs added in #590)
- [x] ~~`open_virtual_mfdataset` `parallel` kwarg is different to the `parallel` kwarg for `xr.open_mfdataset`, because the generalization here should be merged upstream https://github.com/pydata/xarray/pull/9932~~ (EDIT: This one isn't important)
I have workarounds for basically all of them, but they should all be understood and fixed.
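
For the intermittent per-task failures in particular, a generic retry wrapper is one possible stopgap until executor-level retries work. The sketch below is plain Python: it is not Lithops' retry mechanism and not necessarily the workaround used in this demo. `with_retries` and its parameters (`max_attempts`, `base_delay`, `exceptions`) are hypothetical names chosen for illustration.

```python
import time
from functools import wraps


def with_retries(max_attempts=3, base_delay=1.0, exceptions=(Exception,)):
    """Retry the wrapped function with exponential backoff.

    Hypothetical helper for illustration; not part of VirtualiZarr or Lithops.
    """
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    # Out of attempts: surface the last error to the caller.
                    if attempt == max_attempts:
                        raise
                    # Wait base_delay, 2*base_delay, 4*base_delay, ... between tries.
                    time.sleep(base_delay * 2 ** (attempt - 1))
        return wrapper
    return decorator
```

In principle the function mapped over each file could be wrapped this way before it is handed to the executor. Whether the out-of-memory failures surface as a catchable Python exception inside the task is an open question, so this may only help with the HTTP errors.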