Writing files in Tool while executing in script directory may fail during parallel execution

Open soininen opened this issue 4 years ago • 7 comments

Consider a Tool that reads a database and writes something to a file. If multiple scenario/tool filters are defined for that Tool, multiple tool instances will be executed in parallel. If 'Execute in work directory' is not set, or no work directory is available in settings, the instances will execute in the script directory. As a result, multiple tool instances write to the same directory and to the same output file, overwriting each other's results. Can we somehow prevent this?

soininen avatar Feb 22 '21 14:02 soininen

Very nice catch. I feel the work directory is already the solution to this problem. We just need to create one work directory per parallel execution. Execution in source should be disabled in case of parallel executions for this very reason.

Something similar already exists for the 'results' directory, where we create one subdirectory per 'filter id'. I feel the same approach can be extended to work directory generation.
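To illustrate the idea, here is a minimal sketch of deriving one directory per filter id. It is not Toolbox's actual code: the function name `filter_work_dir`, its parameters, and the naming scheme are all assumptions made for the example.

```python
import hashlib
from pathlib import Path


def filter_work_dir(base_dir, tool_name, filter_id):
    """Return a per-filter directory path under base_dir (hypothetical helper)."""
    # Hash the filter id so that arbitrary characters in it cannot break the path.
    digest = hashlib.sha1(filter_id.encode("utf-8")).hexdigest()[:8]
    return Path(base_dir) / f"{tool_name}__{digest}"
```

Because the path depends only on the tool name and the filter id, two instances with different filters never share a directory, while re-running the same filter lands in the same place.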

manuelma avatar Feb 22 '21 14:02 manuelma

We actually already create a unique work directory for each tool instance but only if we execute in a specific work directory, not when a Tool is executed in the script directory.
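As a rough sketch of what "a unique work directory for each tool instance" could look like (this is an illustration only, not the way Toolbox implements it; `make_instance_work_dir` and the `tool__` prefix are made up for the example):

```python
import shutil
import tempfile
from pathlib import Path


def make_instance_work_dir(script_dir, work_root):
    """Copy script_dir into a fresh, uniquely named directory under work_root."""
    Path(work_root).mkdir(parents=True, exist_ok=True)
    # mkdtemp creates a new directory with a name no other process is using.
    unique_dir = tempfile.mkdtemp(prefix="tool__", dir=str(work_root))
    # Copy into a subdirectory that does not exist yet, as copytree requires.
    target = Path(unique_dir) / "src"
    shutil.copytree(str(script_dir), str(target))
    return target
```

Each call returns a different directory, so parallel instances that write `output.csv` relative to their working directory would no longer overwrite each other.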

soininen avatar Feb 23 '21 07:02 soininen

We can't execute in script (source) directory in parallel, so we need to tell the users that if they try... and then execute in work.

The work directory name is unique per execution, true, so nothing needs to be done. In the future we may consider appending the filter_id anyways in case it provides extra useful information?

manuelma avatar Feb 23 '21 07:02 manuelma

> We can't execute in script (source) directory in parallel, so we need to tell the users that if they try... and then execute in work.

This seems like the best solution, indeed.

> In the future we may consider appending the filter_id anyways in case it provides extra useful information?

Not a bad idea at all.

soininen avatar Feb 23 '21 08:02 soininen

@soininen This is done? Close?

jkiviluo avatar Sep 23 '21 07:09 jkiviluo

Still an issue/cannot close.

soininen avatar Sep 23 '21 07:09 soininen

It would be possible to execute tools in parallel that do not use files (provided that reading the same script file by two or more processes is not an issue).

It would also be possible to execute tools in series in the work directory.

Both are a bit of an edge case, but would be nice to cover.

Edit: Could be enough to give a warning: "Parallel runs may cause overwriting of input/output files in the source directory! All may be fine, if no files are used or if Toolbox execution is allowed to use only one process." (thanks Antti).
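A minimal sketch of when such a warning might be issued, assuming the check has access to three pieces of information. The names `source_dir_warning`, `execute_in_work`, `instance_count`, and `max_processes` are hypothetical and do not refer to existing Toolbox settings or functions:

```python
SOURCE_DIR_WARNING = (
    "Parallel runs may cause overwriting of input/output files "
    "in the source directory!"
)


def source_dir_warning(execute_in_work, instance_count, max_processes):
    """Return a warning message if parallel runs could collide, else None."""
    runs_in_source = not execute_in_work
    # Several instances only overlap if more than one process is allowed.
    runs_in_parallel = instance_count > 1 and max_processes != 1
    if runs_in_source and runs_in_parallel:
        return SOURCE_DIR_WARNING
    return None
```

The check cannot know whether the tool actually uses files, so it only warns and leaves the decision to the user, in line with the edge cases mentioned above.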

jkiviluo avatar Nov 23 '21 13:11 jkiviluo