alephclient
Add load-catalog CLI command
This is a first draft of a `load-catalog` command.
It accepts a URL to a catalog index JSON and a few options (include/exclude specific datasets), creates or updates the collections' metadata, and bulk-loads the entities into Aleph.
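To make the include/exclude options concrete, here is a minimal sketch of the dataset selection step. It is not the code from this PR: the function name `select_datasets` is hypothetical, and it assumes the catalog index is a JSON object with a `datasets` list whose items each carry a `name`.

```python
from typing import Any, Dict, Iterable, List, Optional


def select_datasets(
    catalog: Dict[str, Any],
    include: Optional[Iterable[str]] = None,
    exclude: Optional[Iterable[str]] = None,
) -> List[Dict[str, Any]]:
    """Return the catalog datasets that pass the include/exclude filters.

    Assumption: the catalog looks like {"datasets": [{"name": ...}, ...]}.
    An empty include list means "all datasets"; exclude always wins.
    """
    include_set = set(include or [])
    exclude_set = set(exclude or [])
    selected = []
    for dataset in catalog.get("datasets", []):
        name = dataset.get("name")
        if name is None:
            # Without a name there is nothing to map to a foreign_id.
            continue
        if include_set and name not in include_set:
            continue
        if name in exclude_set:
            continue
        selected.append(dataset)
    return selected
```

Letting exclude take precedence over include keeps the behaviour predictable when a dataset is named in both options.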
A few questions arise:
- should we just safely assume that dataset names map to the Aleph `foreign_id`? This could write entities into existing collections with the same `foreign_id` that were not originally meant to be the dataset from the catalog
- do we want more logic (e.g. skip importing datasets not updated since ...)?
- and something like "inspect catalog" to see some metadata without actually importing?
- how to handle collections/children?
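
As a starting point for the "not updated since" and "inspect catalog" questions, here is a rough sketch of how the two could share one check. Everything in it is an assumption rather than part of this PR: the helper names are hypothetical, and it presumes each dataset may carry an ISO 8601 `updated_at` string plus an optional `title`.

```python
from datetime import datetime, timezone
from typing import Any, Dict, List, Optional


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 string; naive values are assumed to be UTC."""
    parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def needs_import(dataset: Dict[str, Any], last_import: Optional[datetime]) -> bool:
    """Import unless the dataset is known to be older than the last import."""
    updated_at = dataset.get("updated_at")  # assumed field name
    if last_import is None or updated_at is None:
        # No information to compare against: import to be safe.
        return True
    return parse_timestamp(updated_at) > last_import


def inspect_catalog(
    catalog: Dict[str, Any], last_import: Optional[datetime] = None
) -> List[Dict[str, Any]]:
    """Summarise the catalog without writing anything to Aleph."""
    return [
        {
            "name": dataset.get("name"),
            "title": dataset.get("title"),
            "updated_at": dataset.get("updated_at"),
            "would_import": needs_import(dataset, last_import),
        }
        for dataset in catalog.get("datasets", [])
    ]
```

An "inspect" mode built this way would report exactly what the real import would do, since both go through `needs_import`.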