miniwdl

mitigate cache invalidation caused by write_* used in workflows

Open mlin opened this issue 3 years ago • 0 comments

The write_* functions always generate a unique filename in the current workflow run directory. So if we have two sequential tasks in a workflow, where the second task consumes the output of the first and an intervening write_* declaration, the second task can never be a cache hit even if the first one is, because the write_* file has a different path on every run.
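To make the failure mode concrete, here is a toy model in Python. It is not miniwdl's actual code: `write_lines` and `path_based_cache_key` are hypothetical stand-ins, and the assumption that the cache key identifies File inputs by path is a simplification for illustration only.

```python
import hashlib
import json
import os
import tempfile
import uuid


def write_lines(lines, run_dir):
    """Toy stand-in for a WDL write_* function: always writes a uniquely named file."""
    path = os.path.join(run_dir, f"write_{uuid.uuid4().hex}.txt")
    with open(path, "w") as f:
        f.write("".join(line + "\n" for line in lines))
    return path


def path_based_cache_key(task_name, inputs):
    """Hypothetical cache key that identifies File inputs by their path (an assumption)."""
    payload = json.dumps({"task": task_name, "inputs": inputs}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()


with tempfile.TemporaryDirectory() as run1, tempfile.TemporaryDirectory() as run2:
    # The first task's output is identical in both runs (e.g. it was a cache hit).
    first_task_output = ["a", "b", "c"]
    key1 = path_based_cache_key("second", {"lines_file": write_lines(first_task_output, run1)})
    key2 = path_based_cache_key("second", {"lines_file": write_lines(first_task_output, run2)})
    # Same content, different path: the second task's key changes, so it misses the cache.
    keys_differ = key1 != key2
```

In this model the second task's key changes between runs even though nothing it depends on has changed.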

Perhaps we can use a content digest just for write_* files, which shouldn't be very large since they necessarily represent data structures that fit in the runner's memory.
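A minimal sketch of what this could look like, assuming the digest is used both to name the file and to identify it in the cache key. The helper names `write_lines_by_digest` and `digest_cache_key` are hypothetical and do not come from miniwdl.

```python
import hashlib
import json
import os
import tempfile


def write_lines_by_digest(lines, run_dir):
    """Hypothetical write_* variant: names the file after the SHA-256 of its content."""
    content = "".join(line + "\n" for line in lines).encode()
    digest = hashlib.sha256(content).hexdigest()
    path = os.path.join(run_dir, f"write_{digest}.txt")
    with open(path, "wb") as f:
        f.write(content)
    return path, digest


def digest_cache_key(task_name, file_digests):
    """Hypothetical cache key that identifies write_* files by content digest, not path."""
    payload = json.dumps({"task": task_name, "inputs": file_digests}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()


with tempfile.TemporaryDirectory() as run1, tempfile.TemporaryDirectory() as run2:
    path1, digest1 = write_lines_by_digest(["a", "b", "c"], run1)
    path2, digest2 = write_lines_by_digest(["a", "b", "c"], run2)
    # The paths still differ (different run directories), but the digests match,
    # so a key built from digests is stable across runs.
    same_key = (digest_cache_key("second", {"lines_file": digest1})
                == digest_cache_key("second", {"lines_file": digest2}))
```

Hashing the content in memory before writing is cheap here for the reason given above: the data already fits in the runner's memory.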

Mar 15 '21 07:03 mlin
