
Programmatically add a catalog to Intake

Open echarles opened this issue 3 years ago • 4 comments

On https://intake.readthedocs.io/en/latest/quickstart.html#adding-data-source-packages-using-the-intake-path I read: `Adding Data Source Packages using the Intake path: Intake checks the Intake config file for catalog_path or the environment variable "INTAKE_PATH" for a colon separated list of paths (semicolon on windows) to search for catalog files. When you import intake we will see all entries from all of the catalogues referenced as part of a global catalog called intake.cat`

  1. Should the title be "Adding Catalog Packages..." instead of "Adding Data Source Packages..."?
  2. Is there an API to add a catalog to intake? (*)

(*) eg.

cat = intake.open_catalog('us_states.yml')
intake.add_catalog(cat) # ???

echarles avatar Oct 25 '22 08:10 echarles

Found how to add to the GUI with intake.gui.add(cat); still looking for a way to add directly to intake.

echarles avatar Oct 25 '22 09:10 echarles

No, there is no runtime API to add entries to intake.cat, although intake.cat._catalogs is a simple list. You are not intended to "install" a data package except with the standard packaging tools pip or conda. The two places searched are:

  • entrypoints (I don't think you can add these at runtime without monkeypatching)
  • YAML files in the user-specific intake data folder, or as specified by environment variables; this will occasionally reload itself.
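
For the environment-variable route, here is a minimal sketch, assuming the behaviour quoted from the quickstart above (INTAKE_PATH is a list of paths joined with the platform's path separator, and is picked up when intake is imported). The directory names and the `build_intake_path` helper are hypothetical and only for illustration.

```python
import os


def build_intake_path(directories):
    """Join directories with the platform separator (':' or ';' on Windows)."""
    return os.pathsep.join(directories)


# Hypothetical directories that would contain catalog YAML files.
catalog_dirs = [
    os.path.join("data", "catalogs"),
    os.path.join("shared", "catalogs"),
]

# Set the variable before importing intake, since the quoted docs describe
# the lookup as happening on import.
os.environ["INTAKE_PATH"] = build_intake_path(catalog_dirs)

# import intake  # uncomment where intake is installed; then inspect intake.cat
```

This only affects the current process and anything it spawns; it is not a runtime API for adding a catalog object that has already been opened.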

Of course, you can always intake.open_catalog or use a service-specific catalog driver at runtime without adding the thing to intake.cat, which is only a convenience.
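
If the goal is just to build a list of catalogs dynamically after import, one option is to keep them in your own container rather than in intake.cat. The sketch below is not part of intake: `CatalogRegistry` and its methods are hypothetical names, and the loader is passed in so that intake.open_catalog (or any other callable) can be used.

```python
from typing import Any, Callable, Dict, List


class CatalogRegistry:
    """Hypothetical user-side registry of catalogs opened at runtime."""

    def __init__(self, loader: Callable[[str], Any]) -> None:
        # loader is any callable that turns a path/URL into a catalog,
        # for example intake.open_catalog.
        self._loader = loader
        self._catalogs: Dict[str, Any] = {}

    def add(self, name: str, path: str) -> Any:
        """Open the catalog at `path` and remember it under `name`."""
        catalog = self._loader(path)
        self._catalogs[name] = catalog
        return catalog

    def __getitem__(self, name: str) -> Any:
        return self._catalogs[name]

    def __contains__(self, name: str) -> bool:
        return name in self._catalogs

    def names(self) -> List[str]:
        """Return the registered catalog names in sorted order."""
        return sorted(self._catalogs)
```

With intake installed, this could be used as `registry = CatalogRegistry(intake.open_catalog)` followed by `registry.add("us_states", "us_states.yml")`, after which `registry["us_states"]` returns the opened catalog. Nothing here touches intake.cat or its private attributes.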

martindurant avatar Oct 25 '22 17:10 martindurant

Thanks for the reply. I now understand that the well-known folders, as well as the data package entrypoints, are used to build that catalog list. I am still under the impression that a programmatic API would be useful in some cases, e.g. dynamically building the list of catalogs after importing intake. Adding directly to the private _catalogs list may work, but at some point other internal structures may need to be updated when adding a catalog, so I would not go down that road.

I propose to leave this issue open for a while to gather any further feedback from the community.

echarles avatar Oct 26 '22 06:10 echarles

The question is, why would those runtime catalogs need to appear under intake.cat, given that you can make any number of catalog instances yourself?

martindurant avatar Oct 26 '22 12:10 martindurant