doit
doit copied to clipboard
Parallel execution issues on macOS
(sorry for the wall of text, I tried to give as much info as possible, feel free to ask me clarification if something is unclear)
Context
It seems like the parallel execution feature of doit
doesn't work on macOS (I noticed it on a coworker's laptop, being on Linux I never noticed any issue on my side).
I investigated on my side and came up with a satisfying workaround/fix (force the use of threads on macOS).
Long story short, this is related to the fact that by default:
-
doit
usesmultiprocessing
for parallel execution -
multiprocessing
usesfork
on Unix-like system - recent version of macOS introduced a safety check against mixing forks and threads.
For more information you can take a look at the issue I tackled on our side (https://github.com/scality/metalk8s/issues/1354).
Here is a minimal dodo.py
that reproduce the problem if you exectute on macOS (10.13 or higher):
#!/usr/bin/env python3
# coding: utf-8
import random
import time
import urllib.request
import doit
def task_foo():
def action():
urllib.request.urlopen('http://example.com/')
time.sleep(random.randint(1, 5))
for i in range(10):
yield {
'name': str(i),
'actions': [action]
}
To be executed with doit -n 4
.
Note that the crash happen only when you're calling into Cocoa libraries (that why I added an urllib
call, simply sleeping doesn't exhibit the behavior).
What can be done
I know it's not really an doit
issue per se, and after reading this very informative thread on the Python bugtracker one can see that the issue will probably disappear with Python 3.8 (where multiprocessing
will be using spawn
instead of fork
by default).
But I believe that we may want to do something on doit
side because Python 3.8 is not out yet (and people will keep using older Python for a while), so it may be interesting to have a fix/workaround in doit
itself.
I think that the issue can be solved by calling either multiprocessing.set_start_method('forkserver')
or multiprocessing.set_start_method('spawn')
(I would go for spawn
since official Python is moving that way) before creating any subprocesses in doit
.
If it doesn't work/it's too much work/outside the scope of doit
, it's possible to instead mention the issue in the documentation about parallel execution (so that people are not surprised/know how to deal with).
Or maybe simply disabling the use of multiprocessing
on macOS (and using only threads).
In any case, if you want to keep the support for multiprocessing
on macOS, you may want to take a look now because the current code will not work in 3.8 (which will be using spawn
by default): adding a call to multiprocessing.set_start_method('spawn')
in the provided dodo.py
triggers a pickle error (which is not totally unexpected as spawn
has more constraints thant fork
, see here).
Environment
- OS: macOS 10.14.3
- Python version: 3.7.3
- doit version: 0.31.1
Thanks for clear explanation.
I just tried doit with following patch:
diff --git a/doit/__main__.py b/doit/__main__.py
index b6a1481..b7818b7 100644
--- a/doit/__main__.py
+++ b/doit/__main__.py
@@ -1,6 +1,9 @@
# lazy way to ignore coverage in this file
if True: # pragma: no cover
def main():
+ import multiprocessing as mp
+ mp.set_start_method('spawn')
+
import sys
from doit.doit_cmd import DoitMain
I test this using doit own dodo.py
task pyflakes
.
Works for me on linux.
forkserver
also worked but was considerably slower.
I suggest a command line argument to control this...
Or maybe an ENV variable. Docs mention that this must be on __main__
section. Again my patch works on linux but not sure about other platforms.
And make fork
mode not available on MAC.
Regarding changing the default. I am not clear in which situation you would have pickle
problems with spawn
but not with fork
. Could you provide a minimum example?
Yes forkserver
is notoriously slower (that's probably why Python is moving to spawn
instead of forkserver
for macOS).
Using the dodo.py
file given in the first message and applying your patch to the doit/__main__.py
in my virtualenv, when I run doit -n 4
I get the following traceback (even on Linux):
Traceback (most recent call last):
File "/home/sylvain/dev/poc_doit/.venv/lib/python3.7/site-packages/doit/doit_cmd.py", line 177, in run
return command.parse_execute(args)
File "/home/sylvain/dev/poc_doit/.venv/lib/python3.7/site-packages/doit/cmd_base.py", line 127, in parse_execute
return self.execute(params, args)
File "/home/sylvain/dev/poc_doit/.venv/lib/python3.7/site-packages/doit/cmd_base.py", line 420, in execute
return self._execute(**exec_params)
File "/home/sylvain/dev/poc_doit/.venv/lib/python3.7/site-packages/doit/cmd_run.py", line 260, in _execute
return runner.run_all(self.control.task_dispatcher())
File "/home/sylvain/dev/poc_doit/.venv/lib/python3.7/site-packages/doit/runner.py", line 256, in run_all
self.run_tasks(task_dispatcher)
File "/home/sylvain/dev/poc_doit/.venv/lib/python3.7/site-packages/doit/runner.py", line 459, in run_tasks
proc_list = self._run_start_processes(job_q, result_q)
File "/home/sylvain/dev/poc_doit/.venv/lib/python3.7/site-packages/doit/runner.py", line 432, in _run_start_processes
process.start()
File "/usr/lib/python3.7/multiprocessing/process.py", line 112, in start
self._popen = self._Popen(self)
File "/usr/lib/python3.7/multiprocessing/context.py", line 223, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "/usr/lib/python3.7/multiprocessing/context.py", line 284, in _Popen
return Popen(process_obj)
File "/usr/lib/python3.7/multiprocessing/popen_spawn_posix.py", line 32, in __init__
super().__init__(process_obj)
File "/usr/lib/python3.7/multiprocessing/popen_fork.py", line 20, in __init__
self._launch(process_obj)
File "/usr/lib/python3.7/multiprocessing/popen_spawn_posix.py", line 47, in _launch
reduction.dump(process_obj, fp)
File "/usr/lib/python3.7/multiprocessing/reduction.py", line 60, in dump
ForkingPickler(file, protocol).dump(obj)
AttributeError: Can't pickle local object 'task_foo.<locals>.action'
According to the documentation spawn
has a few extra restriction which don’t apply to the fork
start method (e.g.: see the "More picklability" section).
Oh, sorry. You had already given an example.
So the problem is usage of a closure as an action. Windows also has this limitation...
Yes, I saw the docs regarding extra restrictions. Although it doesnt mention closures and i am not sure which restrictions from docs is the offending one. I would not know how to make closures work.
I will stick to my proposed solution of adding a command line argument to control this. And by default just use python's default. Maybe add a note on docs for MAC and Windows users to not use closures on actions.
Ha yes, indeed: moving action
outside (and thus making it a normal function instead of a closure) solve the issue.
I'll do the test on the macOS of my coworker on Monday to confirm that it works there as well, but that looks good to me :slightly_smiling_face:
I've just tried your patch on macOS and it works as expected.
good, but i dont want to force use spawn on linux...
So a fix/patch would need to:
- [ ] multiprocessing start method set to
spawn
by default on mac - [ ] add a command line argument to control start method
- [ ] document that closures on actions are not supported on windows/mac
Looks good to me :slightly_smiling_face:
Maybe add a check if someone set fork
on mac (using the command line argument) and raise an error or print a warning in that case.