Include data catalogs in other data catalogs
Kind of request
Adding new functionality
Enhancement Description
The ability to include data catalogs (YAML files) in another data catalog (e.g. final.yml), for example via an include statement. The desired result is that the datasets from all included data catalogs are available in HydroMT.
Take e.g. a data catalog (data_catalog1.yml)
my_data:
- meta: meta
And then include it in another data catalog (e.g. final.yml)
include:
- data_catalog1.yml
- data_catalog2.yml
era5_daily:
- stuff: stuff
- more stuff: more stuff
And then just read this catalog via DataCatalog('./path_to/final.yml').
This would mean, though, that include becomes a reserved key and can no longer be used as a dataset name.
How to implement this is of course up for debate, but I think it would be nice to have.
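One possible way to resolve such an include key is sketched below. This is only an illustration of the idea, not HydroMT code: the function name resolve_includes is hypothetical, the catalogs are represented as already-parsed dictionaries keyed by file name instead of being read from disk, and the content of data_catalog2.yml is made up for the example.

```python
def resolve_includes(name, files, _seen=frozenset()):
    """Return the sources of catalog `name`, expanding `include` recursively.

    `files` maps a catalog file name to its parsed content (hypothetical
    stand-in for reading the YAML files from disk).
    """
    if name in _seen:
        raise ValueError(f"circular include: {name}")
    content = dict(files[name])  # copy so the input is not modified
    merged = {}
    # Included catalogs are merged first, in the order they are listed.
    for included in content.pop("include", []):
        merged.update(resolve_includes(included, files, _seen | {name}))
    # Sources defined in the including catalog itself take precedence.
    merged.update(content)
    return merged


files = {
    "data_catalog1.yml": {"my_data": {"meta": "meta"}},
    "data_catalog2.yml": {"other_data": {"meta": "meta"}},  # assumed content
    "final.yml": {
        "include": ["data_catalog1.yml", "data_catalog2.yml"],
        "era5_daily": {"stuff": "stuff", "more stuff": "more stuff"},
    },
}
sources = resolve_includes("final.yml", files)
```

In this sketch the include key is removed from the result, which is why it could no longer double as a dataset name.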
Use case
Cases where a data catalog YAML file would become very large, or where there would be many separate data catalog YAML files to put in the data_libs list.
Additional Context
No response
I'm not sure if this type of functionality is really needed. In the hydromt build/update configuration, you can already list the data catalogs you want to use under global, instead of passing them on the command line.
global:
data_libs:
- data_catalog1.yml
- data_catalog2.yml
I agree this has low priority. However, it should also be really straightforward to implement, and I can imagine it would help to organize your data catalogs (if you have many). I think we should include it. Given the priority, I don't think it will happen this year though. I've now added it to Q1, but we will discuss during the Q1 planning whether that is actually feasible.
Noting this for the discussion when it becomes relevant: this will need a way to deal with conflicting information, especially if aliases will still be there. I.e. if Cat A says X -> Y and Cat B says X -> Z, what should the correct result be?
I think we already handle this with a warning, whereby a source in the last cat overwrites the previous. This can already occur with the functionality shown by Hélène above.
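The "last catalog wins, with a warning" behaviour described here could look roughly like the sketch below. It is not the actual HydroMT implementation: merge_catalogs and the catalog names cat_a.yml and cat_b.yml are hypothetical, and sources are reduced to plain values for brevity.

```python
import warnings


def merge_catalogs(catalogs):
    """Merge (catalog_name, sources) pairs in order; later catalogs overwrite earlier ones."""
    merged = {}
    for cat_name, sources in catalogs:
        for source_name, source in sources.items():
            # Warn only when an existing source is replaced by a different definition.
            if source_name in merged and merged[source_name] != source:
                warnings.warn(f"source '{source_name}' is overwritten by {cat_name}")
            merged[source_name] = source
    return merged


# Cat A says X -> Y, Cat B says X -> Z: the last catalog wins and a warning is issued.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    merged = merge_catalogs([("cat_a.yml", {"X": "Y"}), ("cat_b.yml", {"X": "Z"})])
```

With an include statement, the same rule would presumably apply in the order in which the catalogs are listed.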
V1 has an entrypoint for adding predefined data catalogs for plugins, so I think this is no longer necessary.