datatree
Merge Datatree siblings when they are compatible
Hey,
I've come across datatree only recently, but I already see many use cases in my work, so thanks for the effort!
A minimal example of what I'd like to have:
- I have data from two models (say 'us' and 'eu'), and they have different `x` and `y` coordinates → therefore I store them as siblings in a datatree `dt` (structure: "forecasts/us", "forecasts/eu") instead of a single Dataset
- After some manipulation, both Datatrees are actually compatible, e.g., because I averaged over `x` and `y`
- for further analyses, I'd like to concat the two Datatrees into a single Dataset along the dimension `model`, something like `xr.concat([dt["us"].ds, dt["eu"].ds], dim="model").assign_coords(model=["us", "eu"])` (for example, I could then use `dt['forecasts'].ds.t2m.plot(hue='model')`)
- it would be nice to allow such an operation, e.g., via `dt.concat_leaves()`, where the result is a datatree "forecasts"
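To make the request concrete, here is a minimal sketch of the sibling concatenation using plain xarray. It assumes the siblings are already compatible after averaging, and it uses a dict of Datasets as a stand-in for the sibling nodes rather than a real DataTree. The helper name `concat_siblings` and the sample values are hypothetical and not part of datatree.

```python
import numpy as np
import xarray as xr

# Stand-ins for the two sibling nodes after spatial averaging.
# A plain dict is used here instead of a real DataTree (assumption).
time = np.arange(3)
siblings = {
    "us": xr.Dataset(
        {"t2m": ("time", np.array([280.0, 281.0, 282.0]))},
        coords={"time": time},
    ),
    "eu": xr.Dataset(
        {"t2m": ("time", np.array([279.0, 280.5, 283.0]))},
        coords={"time": time},
    ),
}


def concat_siblings(datasets, dim):
    """Hypothetical helper: stack compatible Datasets along a new dimension."""
    names = list(datasets)
    # join="exact" makes xarray raise instead of silently aligning
    # when the shared indexes of the siblings differ.
    stacked = xr.concat([datasets[name] for name in names], dim=dim, join="exact")
    # Label the new dimension with the sibling names.
    return stacked.assign_coords({dim: names})


forecasts = concat_siblings(siblings, dim="model")
```

With the result in hand, something like `forecasts.t2m.plot(hue="model")` would give one line per model.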
A very similar use case would be:
- I have forecast runs where the resolution changes after day 15 of the integration
- I could store them as a datatree ("forecast/short-range", "forecast/medium-range")
- after doing some manipulation, e.g., spatial averaging, both forecast ranges could be compatible, and one sibling stores leadtime days 0-14 and one stores days 15-46
- it would be cool to have again something like `dt.merge_leaves()` to have a new dataset with a continuous leadtime
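The second case can be sketched the same way, this time concatenating along the existing `leadtime` dimension. The Datasets below stand in for the two sibling nodes after spatial averaging; the variable names and values are made up for illustration, and no DataTree is involved.

```python
import numpy as np
import xarray as xr

# Stand-ins for the two siblings after spatial averaging (illustrative values).
short_range = xr.Dataset(
    {"t2m": ("leadtime", np.full(15, 280.0))},
    coords={"leadtime": np.arange(0, 15)},  # days 0-14
)
medium_range = xr.Dataset(
    {"t2m": ("leadtime", np.full(32, 281.0))},
    coords={"leadtime": np.arange(15, 47)},  # days 15-46
)

# Concatenate along the existing dimension to get one continuous leadtime axis.
merged = xr.concat([short_range, medium_range], dim="leadtime")

# Sanity check: the combined axis has no duplicates and is strictly ordered.
leadtime_index = merged.indexes["leadtime"]
assert leadtime_index.is_unique and leadtime_index.is_monotonic_increasing
```

A tree-level `merge_leaves()` would presumably also need to decide what happens when the ranges overlap or the siblings are not compatible; the sketch only covers the clean case.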
If something like that is already possible I apologize for my ignorance.
Again, thanks for putting this together.
Cheers, Jonas
see #192 for some discussion on collapsing subtrees in general
Closed in favor of #192