Spine-Toolbox
Writing files in Tool while executing in script directory may fail during parallel execution
Consider a Tool that reads a database and writes something to a file. If multiple scenario/tool filters are defined for that Tool, multiple tool instances will be executed in parallel. If 'Execute in work directory' is not set, or no work directory is available in settings, the instances will execute in the script directory. As a result, multiple tool instances write to the same directory and to the same output file, overwriting each other's results. Can we somehow prevent this?
Very nice catch. I feel the work directory is already the solution to this problem. We just need to create one work directory per parallel execution. Execution in source should be disabled in case of parallel executions for this very reason.
Something similar already exists for the 'results' directory, where we create one subdirectory per 'filter id'. I feel the same approach can be extended to work directory generation.
We actually already create a unique work directory for each tool instance, but only when the Tool is executed in a work directory, not when it is executed in the script directory.
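To illustrate the idea of one work directory per tool instance, here is a minimal sketch. It is not Toolbox's actual code: the function name `make_instance_work_dir` and the `tool_name__` prefix are made up for illustration, and it only assumes that a work root directory is known.

```python
import tempfile
from pathlib import Path


def make_instance_work_dir(work_root: Path, tool_name: str) -> Path:
    """Create and return a fresh, uniquely named directory for one tool instance.

    Hypothetical helper, not part of Spine Toolbox.
    """
    work_root.mkdir(parents=True, exist_ok=True)
    # mkdtemp creates the directory itself and picks a random suffix,
    # so two instances started at the same time never share a path.
    return Path(tempfile.mkdtemp(prefix=f"{tool_name}__", dir=work_root))
```

Each parallel instance would call this once and write its output files under the returned path, so instances cannot overwrite each other's results.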
We can't execute in script (source) directory in parallel, so we need to tell the users that if they try... and then execute in work.
The work directory name is unique per execution, true, so nothing needs to be done. In the future we may consider appending the filter_id anyways in case it provides extra useful information?
We can't execute in script (source) directory in parallel, so we need to tell the users that if they try... and then execute in work.
This seems like the best solution, indeed.
In the future we may consider appending the filter_id anyways in case it provides extra useful information?
Not a bad idea at all.
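The fallback discussed here (tell the user, then execute in the work directory) could be sketched roughly as follows. The function `resolve_execution_mode`, its parameters, and the message text are assumptions made for this example, not existing Toolbox API.

```python
from typing import Optional, Tuple


def resolve_execution_mode(
    execute_in_work: bool, instance_count: int
) -> Tuple[bool, Optional[str]]:
    """Return (use_work_dir, message_for_user).

    Hypothetical helper: forces work directory execution when several
    instances of the same Tool would otherwise share the source directory.
    """
    if execute_in_work or instance_count <= 1:
        # Nothing to override: either the work directory is already in use
        # or there is only a single instance.
        return execute_in_work, None
    message = (
        f"{instance_count} parallel instances cannot execute in the source "
        "directory; executing in the work directory instead."
    )
    return True, message
```

With source directory execution requested and three filtered instances, the call returns `True` together with a message to show in the log; a single instance is left untouched.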
@soininen This is done? Close?
Still an issue/cannot close.
It would be possible to execute tools in parallel that do not use files (provided that reading the same script file by two or more processes is not an issue).
It would also be possible to execute tools in series in the work directory.
Both are somewhat edge cases, but they would be nice to cover.
Edit: It could be enough to give a warning: "Parallel runs may cause overwriting of input/output files in the source directory! All may be fine if no files are used or if Toolbox execution is allowed to use only one process." (thanks Antti).
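A warning-only variant could be sketched like this. The names `parallel_source_warning` and `max_processes` are hypothetical. Whether a Tool uses files cannot be detected here, so that case stays covered only by the wording of the message.

```python
from typing import Optional

PARALLEL_SOURCE_WARNING = (
    "Parallel runs may cause overwriting of input/output files in the "
    "source directory! All may be fine if no files are used or if Toolbox "
    "execution is allowed to use only one process."
)


def parallel_source_warning(
    execute_in_work: bool, instance_count: int, max_processes: int
) -> Optional[str]:
    """Return the warning text when it applies, otherwise None.

    Hypothetical helper, not part of Spine Toolbox.
    """
    if execute_in_work:
        # Each instance gets its own work directory, so no clash is possible.
        return None
    if instance_count <= 1 or max_processes <= 1:
        # A single instance, or instances run one after another.
        return None
    return PARALLEL_SOURCE_WARNING
```

This keeps both edge cases working: tools limited to one process run in series without a warning, and tools that use no files run in parallel with a warning that the user can ignore.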