intake-esm
Better support for datatree / kerchunk
I've been using kerchunk to generate aggregated datasets that have a Zarr group for each "stream" (this could be data on different grids and at different frequencies, e.g. full depth grid monthly means, and daily mean surface data).
I've been storing them as reference files, which works well.
I'd like to put a single entry per simulation in an intake-esm catalog and read it with datatree.open_datatree.
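For concreteness, here is a rough sketch of how the streams show up as Zarr groups inside one reference file. It assumes the kerchunk version 1 reference layout with Zarr v2 metadata keys. The variable names, URLs, offsets, and lengths are invented for illustration, array metadata is left out for brevity, and list_streams is a hypothetical helper, not part of kerchunk.

```python
import json

# Toy reference set in the kerchunk "version 1" layout (Zarr v2 metadata keys).
# Every name, URL, offset and length below is made up for illustration, and the
# per-array .zarray/.zattrs entries are omitted to keep the example short.
reference = {
    "version": 1,
    "refs": {
        ".zgroup": json.dumps({"zarr_format": 2}),
        "h/.zgroup": json.dumps({"zarr_format": 2}),
        "h/temp/0.0.0": ["s3://example-bucket/sim/h_0001.nc", 4096, 1024],
        "sfc/.zgroup": json.dumps({"zarr_format": 2}),
        "sfc/sst/0.0": ["s3://example-bucket/sim/sfc_0001.nc", 8192, 512],
        "wci/.zgroup": json.dumps({"zarr_format": 2}),
        "wci/w/0.0.0": ["s3://example-bucket/sim/wci_0001.nc", 2048, 256],
    },
}


def list_streams(reference):
    """Return the names of the top-level Zarr groups in a reference dict."""
    streams = set()
    for key in reference["refs"]:
        parts = key.split("/")
        # A top-level group is marked by a "<name>/.zgroup" key.
        if len(parts) == 2 and parts[1] == ".zgroup":
            streams.add(parts[0])
    return sorted(streams)


print(list_streams(reference))
```

Each group can then be opened on its own as a dataset, or the whole file can be opened as a tree.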
I think I have two requests:
- turn off aggregation, which seems to be a common request. I'd rather do the aggregation "at write-time" by creating an appropriate JSON file that takes care of various idiosyncrasies (e.g. merging in "static variables") instead of pushing it to the user at read-time.
- an entry in the catalog that switches between using xr.open_dataset and datatree.open_datatree. Eventually, there will be an xr.open_datatree, but the underlying concept of two different functions to open a group vs a full tree will still be around.
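To make the second request concrete, here is a minimal sketch of the kind of switch I have in mind. It is not how intake-esm works today: the "opener" catalog column, its values "dataset" and "datatree", and the open_entry helper are all hypothetical. The only real calls are xr.open_dataset and datatree.open_datatree, imported lazily so that the dispatch logic can be exercised without either package installed.

```python
def _open_dataset(store, **kwargs):
    # Opens a single group (selected with group=...) as an xarray.Dataset.
    import xarray as xr

    return xr.open_dataset(store, engine="zarr", **kwargs)


def _open_datatree(store, **kwargs):
    # Opens every group in the store as one DataTree.
    from datatree import open_datatree

    return open_datatree(store, engine="zarr", **kwargs)


# Hypothetical mapping from a catalog column value to an opener function.
OPENERS = {"dataset": _open_dataset, "datatree": _open_datatree}


def open_entry(row, store, openers=OPENERS, **kwargs):
    """Open `store` with the function named in the row's "opener" field.

    `row` is a dict-like catalog row; "opener" is an assumed column name
    that falls back to "dataset" when it is missing.
    """
    kind = row.get("opener", "dataset")
    if kind not in openers:
        raise ValueError(f"unknown opener {kind!r}; expected one of {sorted(openers)}")
    return openers[kind](store, **kwargs)
```

A row with opener="datatree" would return the full tree, while a row without that field would go through xr.open_dataset, optionally with group="h" or similar passed as a keyword.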
Here's a catalog where there is an entry for each "stream" (h, sfc, wci) and an aggregated dataset with stream="combined".
I'd like to pick some simulations and load the combined stream as a datatree.
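Roughly, the selection step would look like the sketch below. It uses a plain list of dicts as a stand-in for the catalog table, and the "simulation" and "path" column names, the simulation names, and the paths are all made up. Only the stream values come from the catalog described above.

```python
# Stand-in for the catalog table; column names other than "stream" are assumed.
rows = [
    {"simulation": "sim-a", "stream": "h", "path": "sim-a/h.json"},
    {"simulation": "sim-a", "stream": "combined", "path": "sim-a/combined.json"},
    {"simulation": "sim-b", "stream": "sfc", "path": "sim-b/sfc.json"},
    {"simulation": "sim-b", "stream": "combined", "path": "sim-b/combined.json"},
    {"simulation": "sim-c", "stream": "combined", "path": "sim-c/combined.json"},
]


def select_combined(rows, simulations, stream="combined"):
    """Map each requested simulation to the path of its combined reference file."""
    wanted = set(simulations)
    return {
        row["simulation"]: row["path"]
        for row in rows
        if row["stream"] == stream and row["simulation"] in wanted
    }


paths = select_combined(rows, ["sim-a", "sim-c"])
print(paths)
```

Each returned path would then be opened as a whole tree rather than being aggregated with the per-stream entries.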
Do you have any thoughts on how to do this?
One step closer with
- #569.
This should enable the following
turn off aggregation, which seems to be a common request. I'd rather do the aggregation "at write-time" by creating an appropriate JSON file that takes care of various idiosyncrasies (e.g. merging in "static variables") instead of pushing it to the user at read-time.
Regarding
an entry in the catalog that switches between using xr.open_dataset and datatree.open_datatree. Eventually, there will be an xr.open_datatree, but the underlying concept of two different functions to open a group vs a full tree will still be around.
I haven't had a chance to look into possible options. I intend to get back to you next week with some ideas :)