Include data catalogs in other data catalogs

Open dalmijn opened this issue 2 years ago • 4 comments

Kind of request

Adding new functionality

Enhancement Description

The ability to include other data catalogs (YAML files) in a data catalog (e.g. final.yml), for example via an include statement. The desired result is that the datasets from all of the included data catalogs are available in HydroMT.

Take e.g. a data catalog (data_catalog1.yml)

my_data:
  - meta: meta

And then include it in another data catalog (e.g. final.yml)

include:
  - data_catalog1.yml
  - data_catalog2.yml

era5_daily:
  - stuff: stuff
  - more stuff: more stuff

And then just read this catalog via DataCatalog('./path_to/final.yml').

This would mean, though, that include is no longer available as a name for a dataset. How to implement this is of course up for debate, but I think it would be nice to have.
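As a minimal sketch of how such an include statement could be resolved (this is not HydroMT code): the helper name load_catalog, the rule that included paths are relative to the including file, and the rule that the including file's own entries win on a name clash are all assumptions made for illustration. PyYAML is assumed for parsing unless another parser is passed in.

```python
from pathlib import Path
from typing import Callable, Optional


def load_catalog(path, parse: Optional[Callable[[str], dict]] = None, _chain=()) -> dict:
    """Load a catalog file and merge in the files listed under `include`.

    Hypothetical helper for illustration; not part of HydroMT.
    """
    if parse is None:
        import yaml  # PyYAML, assumed to be installed

        parse = yaml.safe_load

    path = Path(path).resolve()
    if path in _chain:
        # The file is already being loaded further up the include chain.
        raise ValueError(f"circular include involving {path}")

    content = dict(parse(path.read_text()) or {})
    # `include` is treated as a reserved key, so it cannot be a dataset name.
    includes = content.pop("include", [])

    merged: dict = {}
    for rel in includes:
        # Assumption: included paths are relative to the including file.
        merged.update(load_catalog(path.parent / rel, parse, _chain + (path,)))
    # Assumption: entries of the including file are applied last and win.
    merged.update(content)
    return merged
```

With the files from the example above, load_catalog('./path_to/final.yml') would return one dictionary holding my_data, era5_daily and whatever data_catalog2.yml defines, with the include key removed.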

Use case

Cases where a data catalog YAML file would become very large, or where a lot of separate data catalog YAML files would otherwise have to be put in the data_libs list.

Additional Context

No response

dalmijn avatar Aug 10 '23 11:08 dalmijn

I'm not sure this type of functionality is really needed. In the hydromt build/update configuration you can already list the data catalogs you want to use under global, instead of passing them on the command line.

global:
  data_libs:
    - data_catalog1.yml
    - data_catalog2.yml

hboisgon avatar Sep 21 '23 06:09 hboisgon

I agree this has low priority. However, it should also be relatively straightforward to implement, and I can imagine it helps to organize your data catalogs if you have many. I think we should include it. Given the priority, I don't think it will happen this year. I've now added it to Q1, but we will discuss during the Q1 planning whether that is actually feasible.

DirkEilander avatar Oct 18 '23 15:10 DirkEilander

noting this for the discussion when it becomes relevant: this will need a way to deal with conflicting information, especially if aliases will still be there. i.e. if Cat A says X -> Y and Cat B says X-> Z what should the correct result be?

savente93 avatar Oct 18 '23 15:10 savente93

noting this for the discussion when it becomes relevant: this will need a way to deal with conflicting information, especially if aliases will still be there. i.e. if Cat A says X -> Y and Cat B says X-> Z what should the correct result be?

I think we already handle this with a warning, whereby a source in the last catalog overwrites a source with the same name from a previous catalog. This can already occur with the functionality shown by Hélène above.
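The last-one-wins behaviour described here can be sketched roughly as follows. This is an illustration only, not the actual HydroMT implementation: merge_catalogs, the catalog file names and the warning text are made up for the example.

```python
import warnings


def merge_catalogs(catalogs):
    """Merge (catalog name, sources) pairs in order; later catalogs win.

    Hypothetical helper for illustration; not part of HydroMT.
    """
    merged = {}
    origin = {}  # source name -> catalog that currently provides it
    for cat_name, sources in catalogs:
        for source_name, source in sources.items():
            if source_name in merged:
                warnings.warn(
                    f"source '{source_name}' from '{origin[source_name]}' "
                    f"is overwritten by '{cat_name}'"
                )
            merged[source_name] = source
            origin[source_name] = cat_name
    return merged


# Cat A says X -> Y and Cat B says X -> Z: the later catalog wins, with a warning.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = merge_catalogs(
        [
            ("cat_a.yml", {"X": "Y"}),
            ("cat_b.yml", {"X": "Z"}),
        ]
    )
```

Under this rule the order in which catalogs are listed decides the outcome, so an include statement would need a documented ordering as well.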

DirkEilander avatar Oct 18 '23 15:10 DirkEilander

V1 has an entrypoint for adding predefined data catalogs for plugins, so I think this is no longer necessary.

savente93 avatar Mar 24 '25 12:03 savente93