pangeo-forge-recipes
Implement adapter for `OpenWithKerchunk | OpenWithXarray`
In theory, `OpenWithKerchunk` should be able to provide inputs for `OpenWithXarray`, which can help address issues such as https://github.com/pangeo-forge/pangeo-forge-recipes/issues/361. IIUC a version of this existed in `0.9.4` (or at least was in development there). Discussion in https://github.com/leap-stc/cmip6-leap-feedstock/issues/16#issuecomment-1694414477 reminded me that this would be a useful thing to implement (or re-implement, as the case may be).
This seems like a great idea!
As far as design goes: `OpenWithKerchunk` returns a `PCollection` of references in memory. It seems like there would need to be either:
- An additional `PTransform` to convert the `PCollection` of references to a `PCollection` of `fsspec` mappers that could be read by `OpenWithXarray`?
- An option within `OpenWithKerchunk` that returns `fsspec` mappers.
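For reference, the per-element conversion that either option would need can be sketched roughly as follows. This is a minimal sketch, not the pangeo-forge-recipes implementation: `refs_to_mapper` and `example_refs` are hypothetical names, and the hand-written reference dict (inline metadata only, no chunk references) merely stands in for one element emitted by `OpenWithKerchunk`. It relies only on `fsspec`'s `"reference"` filesystem.

```python
import json

import fsspec


def refs_to_mapper(refs: dict):
    """Wrap an in-memory kerchunk-style reference dict in an fsspec mapper.

    Hypothetical helper for illustration; not part of pangeo-forge-recipes.
    """
    # "reference" selects fsspec's ReferenceFileSystem; `fo` accepts the dict directly.
    fs = fsspec.filesystem("reference", fo=refs)
    # An empty root exposes the reference keys (".zgroup", "var/0.0", ...) as-is.
    return fs.get_mapper("")


# Assumed stand-in for one element of the PCollection of references:
# only inline metadata, so nothing is fetched from remote storage.
example_refs = {
    "version": 1,
    "refs": {
        ".zgroup": json.dumps({"zarr_format": 2}),
        ".zattrs": json.dumps({"title": "example"}),
    },
}

mapper = refs_to_mapper(example_refs)
zgroup = json.loads(mapper[".zgroup"])  # inline references come back as bytes
```

In a Beam pipeline, a function like this could presumably be applied per element with `beam.Map(refs_to_mapper)`. Whether `OpenWithXarray` can consume such mappers directly, rather than URLs, is an open question in this thread.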
Any thoughts here @cisaacstern?
Good questions, @norlandrhagen.
> An option within `OpenWithKerchunk` that returns `fsspec` mappers.
I think I'd lean towards this option. The downside is that it introduces multiple return types into `OpenWithKerchunk`, but the benefit is that it keeps the user-facing API simpler.
Another option would be to have `OpenWithXarray` use an `engine` (`xr.open_dataset` backend) that immediately knows what to do with the references (the `"kerchunk"` engine discussed in fsspec/kerchunk#360?)
That would be the best way!
@keewis are you working on that PR/issue or do you know if there is any development on it?
I'm not working on this nor am I planning to at the moment (and I'm not aware of anyone else doing so), but the development will most likely happen on the `kerchunk` repo.
Thanks for mentioning this @keewis.
Whichever solution we choose here, let's link https://github.com/fsspec/kerchunk/issues/360 in a comment, and mention that the implementation here is a shim until that issue is resolved.