
doit auto crashes with python 3.8 on MacOS

Open boileaum opened this issue 4 years ago • 11 comments

Describe the bug

Consider the following dodo.py file:

/private/tmp$ cat dodo.py 
def task_hello():
    """hello"""

    def python_hello(targets):
        with open(targets[0], "a") as output:
            output.write("Python says Hello World!!!\n")

    return {
        'actions': [python_hello],
        'targets': ["hello.txt"],
        }

doit auto works with Python 3.7:

/private/tmp$ doit --version
0.33.1
lib @ /private/tmp/.env/lib/python3.7/site-packages/doit
/private/tmp$ doit auto
.  hello
^CProcess Process-1:
Traceback (most recent call last):
  File "/usr/local/opt/[email protected]/Frameworks/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/usr/local/opt/[email protected]/Frameworks/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "/private/tmp/.env/lib/python3.7/site-packages/doit/cmd_auto.py", line 131, in run_watch
    file_watcher.loop()
  File "/private/tmp/.env/lib/python3.7/site-packages/doit/filewatch.py", line 100, in loop
    self._loop_darwin()
  File "/private/tmp/.env/lib/python3.7/site-packages/doit/filewatch.py", line 72, in _loop_darwin
    observer.run()
  File "/private/tmp/.env/lib/python3.7/site-packages/fsevents.py", line 116, in run
    self.event.wait()
  File "/usr/local/opt/[email protected]/Frameworks/Python.framework/Versions/3.7/lib/python3.7/threading.py", line 552, in wait
    signaled = self._cond.wait(timeout)
  File "/usr/local/opt/[email protected]/Frameworks/Python.framework/Versions/3.7/lib/python3.7/threading.py", line 296, in wait
    waiter.acquire()
KeyboardInterrupt

but crashes with Python 3.8:

/private/tmp$ doit --version
0.33.1
lib @ /private/tmp/.env-3.8/lib/python3.8/site-packages/doit
/private/tmp$ doit auto
Traceback (most recent call last):
  File "/private/tmp/.env-3.8/lib/python3.8/site-packages/doit/doit_cmd.py", line 190, in run
    return command.parse_execute(args)
  File "/private/tmp/.env-3.8/lib/python3.8/site-packages/doit/cmd_base.py", line 150, in parse_execute
    return self.execute(params, args)
  File "/private/tmp/.env-3.8/lib/python3.8/site-packages/doit/cmd_auto.py", line 139, in execute
    proc.start()
  File "/Users/boileau/opt/anaconda3/lib/python3.8/multiprocessing/process.py", line 121, in start
    self._popen = self._Popen(self)
  File "/Users/boileau/opt/anaconda3/lib/python3.8/multiprocessing/context.py", line 224, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "/Users/boileau/opt/anaconda3/lib/python3.8/multiprocessing/context.py", line 283, in _Popen
    return Popen(process_obj)
  File "/Users/boileau/opt/anaconda3/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 32, in __init__
    super().__init__(process_obj)
  File "/Users/boileau/opt/anaconda3/lib/python3.8/multiprocessing/popen_fork.py", line 19, in __init__
    self._launch(process_obj)
  File "/Users/boileau/opt/anaconda3/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 47, in _launch
    reduction.dump(process_obj, fp)
  File "/Users/boileau/opt/anaconda3/lib/python3.8/multiprocessing/reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
TypeError: cannot pickle '_io.TextIOWrapper' object

Environment

  1. OS: MacOS 10.15.6
  2. python version: Python 3.8.3
  3. doit version: 0.33.1

boileaum avatar Sep 04 '20 13:09 boileaum

This seems specific to macOS. There is no problem inside a python:3 Docker container (run from the macOS environment):

tmp$ docker run --rm -ti python:3 /bin/bash
root@ea1181c1d2ae:/# pip install doit
Collecting doit
  Downloading doit-0.33.1-py3-none-any.whl (84 kB)
     |████████████████████████████████| 84 kB 2.3 MB/s 
Collecting pyinotify; sys_platform == "linux"
  Downloading pyinotify-0.9.6.tar.gz (60 kB)
     |████████████████████████████████| 60 kB 4.4 MB/s 
Collecting cloudpickle
  Downloading cloudpickle-1.6.0-py3-none-any.whl (23 kB)
Building wheels for collected packages: pyinotify
  Building wheel for pyinotify (setup.py) ... done
  Created wheel for pyinotify: filename=pyinotify-0.9.6-py3-none-any.whl size=25339 sha256=a44970bc035a3a09e43ae6d288d555faa55855e002916cd196dfe4d87ece31ce
  Stored in directory: /root/.cache/pip/wheels/9d/a0/4b/1a80814e4ad0b035c07831ea1b06b691046198492bbc5769b6
Successfully built pyinotify
Installing collected packages: pyinotify, cloudpickle, doit
Successfully installed cloudpickle-1.6.0 doit-0.33.1 pyinotify-0.9.6
root@ea1181c1d2ae:/# python --version
Python 3.8.5
root@ea1181c1d2ae:/# cat > dodo.py <<- EOM
def task_hello():
    """hello"""

    def python_hello(targets):
        with open(targets[0], "a") as output:
            output.write("Python says Hello World!!!\n")

    return {
        'actions': [python_hello],
        'targets': ["hello.txt"],
        }
EOM
root@ea1181c1d2ae:/# doit
.  hello
root@ea1181c1d2ae:/# doit auto
.  hello
^Croot@ea1181c1d2ae:/#

boileaum avatar Sep 04 '20 13:09 boileaum

Based on #368, it seems the multiprocess package works better on Mac. Could you try it? I don't have a Mac, so don't expect a fix from me...

schettino72 avatar Sep 04 '20 14:09 schettino72

Python 3.8 changed the default multiprocessing start method on macOS from fork to spawn, so child processes are now started fresh and receive their state via pickle. It looks like doit is trying to run the action with multiprocessing and passes a file object to it, which is not picklable.
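
The failure mode can be reproduced outside doit. The sketch below is only an illustration, not doit's code: `can_pickle` is a hypothetical helper, and the temporary file stands in for whatever stream object ends up attached to the Process. It checks whether an object survives `pickle.dumps`, which is the step that fails in the traceback above.

```python
import multiprocessing
import pickle
import tempfile


def can_pickle(obj):
    """Return True if obj can be serialized with pickle.dumps.

    Hypothetical helper: under the spawn start method, everything attached
    to a Process (target, args, kwargs) must pass this kind of check.
    """
    try:
        pickle.dumps(obj)
    except (TypeError, AttributeError, pickle.PicklingError):
        return False
    return True


if __name__ == "__main__":
    # "spawn" on macOS since Python 3.8; other platforms have other defaults.
    print("default start method:", multiprocessing.get_start_method())

    # An open text file cannot be pickled, so it cannot cross a spawn boundary.
    with tempfile.TemporaryFile("w") as stream:
        print("open text file picklable:", can_pickle(stream))

    # A plain path string can, so the child can reopen the file itself.
    print("path string picklable:", can_pickle("hello.txt"))
```

Passing paths or other plain data to the child, and opening files inside it, avoids the problem regardless of the start method.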

Kwpolska avatar Sep 04 '20 14:09 Kwpolska

Indeed, replacing multiprocessing with multiprocess solves the problem.

boileaum avatar Sep 04 '20 14:09 boileaum

Great. Now I am curious to know whether it works because it is better at handling pickled objects, or because it did not change the default to spawn.

schettino72 avatar Sep 04 '20 15:09 schettino72

They patched it back to fork: https://github.com/uqfoundation/multiprocess/blob/0f7a383d6087633eb2d7fed45b76c6c79ad9b88a/py3.8/multiprocess/context.py#L315

Kwpolska avatar Sep 04 '20 15:09 Kwpolska

Hmm. It says that using fork on Mac is not reliable... So using multiprocess might be a regression in some cases, right?

schettino72 avatar Sep 04 '20 18:09 schettino72

Yes, but it can be fixed by manually setting the start method back to spawn with multiprocess.set_start_method('spawn') before starting a Process. I tested with the above snippet and auto, and it worked in spawn mode with multiprocess.
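
A minimal sketch of selecting the start method explicitly follows. It uses the standard library's multiprocessing as a stand-in, on the assumption that multiprocess (a fork of it) exposes the same `get_context` / `set_start_method` API. The function names `append_greeting` and `run_with_spawn` are made up for illustration and are not part of doit.

```python
import multiprocessing
import os
import tempfile


def append_greeting(path):
    """Work done in the child: it receives a picklable path and opens the file itself."""
    with open(path, "a") as output:
        output.write("Python says Hello World!!!\n")


def run_with_spawn(path):
    """Run append_greeting in a spawned child process and return its exit code."""
    # get_context() scopes the start method to this context object, whereas
    # set_start_method() changes the process-wide default and may only be
    # called once per program.
    ctx = multiprocessing.get_context("spawn")
    proc = ctx.Process(target=append_greeting, args=(path,))
    proc.start()
    proc.join()
    return proc.exitcode


if __name__ == "__main__":
    # Spawned children re-import the main module, so this guard is required.
    target = os.path.join(tempfile.mkdtemp(), "hello.txt")
    print("child exit code:", run_with_spawn(target))
```

Using a context object rather than the global setter keeps the choice local to the code that starts the process, which may matter for a library that should not change settings for its host application.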

Kwpolska avatar Sep 04 '20 20:09 Kwpolska

Perhaps unsurprisingly, there appear to be some related issues with Python 3.9 on macOS:

=================================== FAILURES ===================================
__________________________ TestAuto.test_invalid_args __________________________

self = <tests.test_cmd_auto.TestAuto object at 0x7fc3a33fd5b0>
        cmd = CmdFactory(cmd_auto.Auto, task_loader=task_loader)
        # terminates with error number
>       assert cmd.parse_execute(['t2']) == 3

tests/test_cmd_auto.py:60: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
../../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.9/site-packages/doit/cmd_base.py:150: in parse_execute
    return self.execute(params, args)
../../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.9/site-packages/doit/cmd_auto.py:139: in execute
    proc.start()
../../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.9/multiprocessing/process.py:121: in start
    self._popen = self._Popen(self)
../../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.9/multiprocessing/context.py:224: in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
../../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.9/multiprocessing/context.py:284: in _Popen
    return Popen(process_obj)
../../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.9/multiprocessing/popen_spawn_posix.py:32: in __init__
    super().__init__(process_obj)
../../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.9/multiprocessing/popen_fork.py:19: in __init__
    self._launch(process_obj)
../../_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold/lib/python3.9/multiprocessing/popen_spawn_posix.py:47: in _launch
    reduction.dump(process_obj, fp)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

obj = <Process name='Process-1' parent=1240 initial>
file = <_io.BytesIO object at 0x7fc3a3b45810>, protocol = None

    def dump(obj, file, protocol=None):
        '''Replacement for pickle.dump() using ForkingPickler.'''
>       ForkingPickler(file, protocol).dump(obj)
E       TypeError: cannot pickle 'EncodedFile' object

bollwyvl avatar Oct 11 '20 15:10 bollwyvl

As discussed on the recent scipy thread, the aging/non-existent libraries that support the auto feature are increasing the difficulty of keeping doit packaged.

The proposal over there was to move auto into an [extra], so that e.g. pip check would pass without any OS-specific binary packages installed.

At a deeper level: having a generic watcher is useful, and having a robust, yet optional, cross-platform solution is still very attractive.

It looks like watchfiles, which is based on Rust, would be a good candidate for this in the future. It has a far simpler API than either macfsevents or pyinotify and supports Windows (#17).

bollwyvl avatar Apr 09 '22 17:04 bollwyvl

@bollwyvl moving discussion to #404

schettino72 avatar Apr 10 '22 07:04 schettino72