*.define() `no_cache` option for tasks that modify persistent data

Open • tclose opened this issue 9 months ago • 12 comments

What would you like changed/added and why?

Add a `no_cache` option to the `pydra.compose.*.define()` functions to indicate that a task must not be cached.

Currently, a task that modifies a persistent data store (e.g. a directory) can silently skip the modification on later runs. This happens when the same input parameters are reused and either the store doesn't fully capture its current state or its state has been reset (e.g. an output sub-directory was deleted).

For such tasks, a purely random checksum could be generated to guarantee a unique working directory. However, this would require #784 or similar functionality; otherwise downstream nodes won't know where to find this directory.
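
A rough sketch of the random-checksum idea, assuming the checksum is what names the working directory. `task_checksum` and its `no_cache` flag are hypothetical and only show the intended behaviour, not how Pydra computes hashes today.

```python
import hashlib
import json
import uuid


def task_checksum(inputs: dict, no_cache: bool = False) -> str:
    """Return the key used to name a task's working directory (hypothetical)."""
    if no_cache:
        # Random key: never matches an earlier run, so the task always executes.
        return uuid.uuid4().hex
    # Deterministic key: identical inputs map to the same working directory.
    payload = json.dumps(inputs, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()


inputs = {"store": "/data/project", "name": "derivatives"}

cached_a = task_checksum(inputs)
cached_b = task_checksum(inputs)
fresh_a = task_checksum(inputs, no_cache=True)
fresh_b = task_checksum(inputs, no_cache=True)
```

With `no_cache=True`, two runs with identical inputs get different keys, which is why the chosen key would have to be passed on to downstream nodes explicitly rather than recomputed from the inputs.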

What would be the benefit? Does the change make something easier to use?

Subsequent task runs would always execute, so the modifications are guaranteed to be made.

tclose • Apr 09 '25 05:04