Cached catalog

Open vijayvammi opened this issue 1 year ago • 1 comments

To solve the problem of sourcing data and managing large datasets, bring in a cached catalog.

The behaviour is as follows:

In either GET or PUT, we do not actually move any files around.

  • For any of the contents in the GET, we keep track of the MD5 hash of the files.
  • Same for PUT, we keep track of the MD5 hash.
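The hash tracking described above could be sketched as follows. This is only an illustration, not the code in the branch: the function names `md5_of_file`, `record_hashes` and `changed_files` are hypothetical, and it assumes the catalog entries are ordinary files under a single root directory.

```python
import hashlib
from pathlib import Path


def md5_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the MD5 hex digest of a file, read in chunks so large files fit in memory."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def record_hashes(root: Path, pattern: str = "*") -> dict[str, str]:
    """Map each matching file (path relative to root) to its MD5 hash.

    Nothing is copied or moved; only the hashes are recorded.
    """
    return {
        str(path.relative_to(root)): md5_of_file(path)
        for path in sorted(root.rglob(pattern))
        if path.is_file()
    }


def changed_files(before: dict[str, str], after: dict[str, str]) -> list[str]:
    """List files that are new or whose hash differs between two snapshots."""
    return sorted(name for name, digest in after.items() if before.get(name) != digest)
```

Taking one snapshot at GET and another at PUT, then comparing them with `changed_files`, would give the "only the changed ones" variant; storing both snapshots in full would give the "all files" variant.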

Open question: should we keep the hashes of all input and output files, or only of the ones that changed?

vijayvammi, Nov 23 '24 14:11

This is still in a branch and has not been merged yet: https://github.com/AstraZeneca/runnable/tree/do-not-copy-catalog

vijayvammi, Mar 10 '25 14:03

The design above was wrong: it should be store_copy on put. This is now fixed.

vijayvammi, Aug 31 '25 05:08