magnus-core
Cached catalog
To solve the problems of sourcing data and managing large datasets, bring in a cached catalog.
The behaviour is as follows:
In either GET or PUT, we do not actually move any files around.
- For every file matched by a GET, we record its MD5 hash.
- Likewise for PUT: we record the MD5 hash of every file.
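The hash tracking described above could look roughly like the following. This is a minimal sketch, not the code in the branch: the names `md5_of_file`, `record_hashes` and `changed_files` are hypothetical, and it assumes the catalog only needs a mapping from file path to MD5 digest.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it in chunks.

    Chunked reads keep memory use flat for large datasets.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def record_hashes(paths):
    """Map each file path to its MD5 hash; the files stay where they are."""
    return {str(Path(p)): md5_of_file(p) for p in paths}


def changed_files(before, after):
    """Return the paths in `after` that are new or whose hash differs from `before`."""
    return sorted(path for path, md5 in after.items() if before.get(path) != md5)
```

Comparing the mapping recorded at GET with the one recorded at PUT, as `changed_files` does, is one way to find the files a step modified without copying anything.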
Open question: should we keep the hashes of all input and output files, or only of the ones that changed?
This is still in a branch and has not been merged yet: https://github.com/AstraZeneca/runnable/tree/do-not-copy-catalog
The design was wrong: it should be store_copy on PUT. This is now fixed.