
multiprocessing.Process generates FileNotFoundError when argument isn't explicitly referenced

Open JZerf opened this issue 2 years ago • 11 comments

Bug report

This is a continuation of the possible bug mentioned in issue https://github.com/python/cpython/issues/82236, which was closed because DonnyBrown, the submitter, didn't provide enough information.

DonnyBrown was getting a FileNotFoundError when starting a process with multiprocessing.Process that uses an argument that doesn't have an explicit reference. I'm able to reproduce the same error using the test code DonnyBrown provided in that issue on Ubuntu Desktop LTS 22.04 x86-64 with CPython 3.10.4. @iritkatriel mentioned that they were unable to reproduce the error on Windows 10 with Python 3.10.

I can also reproduce the error using this slightly simplified version of DonnyBrown's test code:

import multiprocessing

def demo(argument):
    print(argument)

if __name__=="__main__":
    multiprocessing.set_start_method("spawn") # Changing this to "fork" (on platforms where it is
                                              # available) can also cause the below code to work.


    process=multiprocessing.Process(target=demo, args=[multiprocessing.Value("i", 0)]) # FAILS

    #process=multiprocessing.Process(target=demo, args=[0])                            # WORKS

    #reference_To_Number=multiprocessing.Value("i", 0)                                 # WORKS
    #process=multiprocessing.Process(target=demo, args=[reference_To_Number])


    process.start()
    process.join()

The traceback I get with the above code is:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "/usr/lib/python3.10/multiprocessing/spawn.py", line 126, in _main
    self = reduction.pickle.load(from_parent)
  File "/usr/lib/python3.10/multiprocessing/synchronize.py", line 110, in __setstate__
    self._semlock = _multiprocessing.SemLock._rebuild(*state)
FileNotFoundError: [Errno 2] No such file or directory

The above code can be made to work on my test system by making any of the following changes:

  • Change the process start method to "fork" instead.
  • Change the process argument to a simple integer instead of a multiprocessing.Value.
  • Assign the multiprocessing.Value to a variable and change the process argument to use the variable.

I'm not a Python expert, so maybe this is the expected behavior when passing a multiprocessing.Value directly to a spawned process. But it does seem odd that making any of the above changes causes the code to work, and that (based on @iritkatriel's success with DonnyBrown's test code) running it on Windows 10, which also uses the "spawn" start method, apparently works as well.

Your environment

  • CPython versions tested on: 3.10.4
  • Operating system and architecture: Linux, Ubuntu Desktop LTS 22.04, x86-64

JZerf avatar Jul 11 '22 23:07 JZerf

Confirmed the issue on Python 3.9 and 3.12 on macOS 11.5.2.

akulakov avatar Jul 19 '22 21:07 akulakov

Confirmed the issue on Python 3.8 on macOS 13.0.1.

curonny avatar Nov 22 '22 19:11 curonny

Confirmed the issue on Python 3.8.16 on macOS 13.1 (22C65).

whitedemong avatar Jan 05 '23 16:01 whitedemong

Same here on Ubuntu 22.04 with Python 3.10.6. The file it is looking for has 777 permissions: specifically [working_directory]/lib/tom-select/tom-select.css, a file created by pyvis v0.3.1 during a past run of the same script. If I rm -rf lib and run it again, I get the error [Errno 39] Directory not empty: 'vis-9.0.4'. This code was previously stable without a lock context (it ran thousands of times without a problem). When I nest it in a function, it still works; when I call that function as the target of a Process, I get these errors.

These errors were also very opaque: they required a chain of try: ..., except Exception as err: ... print(err) clauses just to get them to print to the console. I presume there is an issue with how stderr is piped in this context as well.

Additionally, running the process with the "fork" context does not resolve the issue (same error). The workaround of using args = [multiprocessing.Value(...)] instead of args=(0) throws the error TypeError: this type has no size:

Traceback (most recent call last):
  File "/[redacted]/my-script.py", line 346, in <module>
    processes_list = [ctx.Process(target=objective,
  File "/[redacted]/my-script.py", line 347, in <listcomp>
    args=[Value("trial", 0)]
  File "/usr/lib/python3.10/multiprocessing/context.py", line 135, in Value
    return Value(typecode_or_type, *args, lock=lock,
  File "/usr/lib/python3.10/multiprocessing/sharedctypes.py", line 74, in Value
    obj = RawValue(typecode_or_type, *args)
  File "/usr/lib/python3.10/multiprocessing/sharedctypes.py", line 49, in RawValue
    obj = _new_value(type_)
  File "/usr/lib/python3.10/multiprocessing/sharedctypes.py", line 40, in _new_value
    size = ctypes.sizeof(type_)
TypeError: this type has no size

As a (not ideal) workaround, I ultimately ran the offending code as a parameterized script under subprocess.run() and used a Process() as a proxy between the main script and the actual process. That "worked for now".
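One common way to make child-process errors less opaque, along the lines of the try/except chains mentioned above, is to wrap the target so the child prints its own traceback before dying. A minimal sketch (the `reporting_target` helper is hypothetical, not from this thread):

```python
import multiprocessing
import traceback

def reporting_target(target, *args):
    """Hypothetical wrapper: run `target` in the child process and print
    any traceback to the child's stderr before re-raising it."""
    try:
        target(*args)
    except Exception:
        traceback.print_exc()
        raise

def work(n):
    return 1 / n  # raises ZeroDivisionError when n == 0

if __name__ == "__main__":
    p = multiprocessing.Process(target=reporting_target, args=(work, 0))
    p.start()
    p.join()
    print("child exit code:", p.exitcode)  # non-zero when the child failed
```

The wrapper changes nothing about the underlying bug, but it surfaces the child's traceback on the console instead of letting it vanish.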

david-thrower avatar Apr 22 '23 00:04 david-thrower

https://superfastpython.com/filenotfounderror-multiprocessing-python/

It's explained here.

A simple time.sleep(1) after p.start() helped me.

RinatV avatar Jun 03 '23 08:06 RinatV

Confirmed the issue on Python 3.9.17 on MacOS 14.

Luferov avatar Aug 11 '23 18:08 Luferov

I was having a similar issue with sharing concurrency primitives (a multiprocessing.Queue in my case) across processes when using the spawn backend.

I believe this is happening because of ref counts / garbage collection. If there's a possibility the object gets deleted by the main process before or while being shared, the file isn't around when the other process looks for it, hence the FileNotFoundError. This explains why putting the object in a variable (preventing it from being deallocated) works while passing it inline as a process argument does not.

The object getting deleted could also happen if the main process ends too soon, as referenced in this article: https://superfastpython.com/filenotfounderror-multiprocessing-python/
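The deallocation timing described above can be observed in CPython without multiprocessing at all. A minimal sketch using weakref (`Payload` is just a stand-in for a shared object like multiprocessing.Value):

```python
import weakref

class Payload:
    """Stand-in for a shared object such as multiprocessing.Value."""

def use_inline():
    # The Payload created inline has no other strong reference, so in
    # CPython it is deallocated as soon as this expression finishes.
    return weakref.ref(Payload())

dead = use_inline()
print(dead() is None)   # True: the inline object is already gone

payload = Payload()     # a named variable keeps the object alive
alive = weakref.ref(payload)
print(alive() is None)  # False: still reachable via `payload`
```

This only demonstrates CPython's immediate refcount-based deallocation; whether that is the actual mechanism behind the semaphore disappearing in the spawn case is the hypothesis being discussed in this thread, not an established fact.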

Starbuck5 avatar Aug 11 '23 19:08 Starbuck5

We can reproduce this problem with the following piece of code using Python 3.8.10 on Ubuntu Linux 20.04:

import multiprocessing as mp

def demo(argument):
    print(argument)

def create_process():
    arg = mp.Value("i", 0)
    return mp.Process(target=demo, args=[arg])

if __name__ == "__main__":
    mp.set_start_method("spawn")  # fails
    # mp.set_start_method("fork")  # works
    # mp.set_start_method("forkserver")  # also fails

    process = create_process()
    process.start()
    process.join()

This leads to the same stack trace as in the OP. The issue does not seem to be related to the garbage collector: disabling it before creating the process and re-enabling it after join() does not help either.

Is this a bug in CPython, or are we supposed to perform these steps in a different way?

haimat avatar Sep 15 '23 12:09 haimat

This leads to the same stacktrace as in the OP. The issue does not seem to be related to the garbage collector, as disabling it before creating the process and enabling it after join() does also not help.

The object gets deallocated anyway because its refcount reaches 0. That's not part of the garbage collector, I think.

After your create_process function finishes, the value of arg gets deallocated. If you create arg in the if __name__ == "__main__" block and pass it to create_process, I think the issue will be solved.
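Applied to the snippet above, that suggestion would look roughly like this (a sketch of the proposed change, not a confirmed fix):

```python
import multiprocessing as mp

def demo(argument):
    print(argument)

def create_process(arg):
    # The caller owns `arg`; this function only wires it to the Process.
    return mp.Process(target=demo, args=[arg])

if __name__ == "__main__":
    mp.set_start_method("spawn")
    arg = mp.Value("i", 0)  # referenced here until join() returns
    process = create_process(arg)
    process.start()
    process.join()
```

The only difference from the failing version is that the Value is created in the main block and stays referenced there for the lifetime of the child process.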

Starbuck5 avatar Sep 19 '23 07:09 Starbuck5

Hitting this problem as well.

Exception ignored in: <Finalize object, dead>
Traceback (most recent call last):
  File "/usr/lib64/python3.10/multiprocessing/util.py", line 224, in __call__
    res = self._callback(*self._args, **self._kwargs)
  File "/usr/lib64/python3.10/multiprocessing/synchronize.py", line 87, in _cleanup
    sem_unlink(name)
FileNotFoundError: [Errno 2] No such file or directory

ziegenbalg avatar Mar 13 '24 20:03 ziegenbalg

In my conda environment I'm using python=3.12.2, yet I get a warning that refers to the multiprocessing module of Python 3.8.

I've already double-checked the Python version using python --version, and I'm on the latest version.

File "/opt/anaconda/anaconda3/lib/python3.8/multiprocessing/util.py", line 300, in _run_finalizers
    finalizer()
  File "/opt/anaconda/anaconda3/lib/python3.8/multiprocessing/util.py", line 224, in __call__
    res = self._callback(*self._args, **self._kwargs)
  File "/opt/anaconda/anaconda3/lib/python3.8/multiprocessing/synchronize.py", line 87, in _cleanup
    sem_unlink(name)
FileNotFoundError: [Errno 2] No such file or directory

To be specific, I'm using PyTorch multiprocessing to spawn multiple processes for multi-GPU training. This is the issue that I'm facing.

zuliani99 avatar Mar 21 '24 07:03 zuliani99

Process Process-1:
Traceback (most recent call last):
  File "/usr/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/usr/lib/python3.10/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/home/ubuntu/Downloads/Pensieve-DRL-Master-thesis/pensieve-pytorch/hyp_param_test.py", line 114, in central_agent
    s_batch, a_batch, r_batch, terminal, info, net_env = exp_queues[i].get() # for all the 3 agents, so a vector of size 3 (i.e. s,a,r_batch)
  File "/usr/lib/python3.10/multiprocessing/queues.py", line 122, in get
    return _ForkingPickler.loads(res)
  File "/home/ubuntu/.local/lib/python3.10/site-packages/torch/multiprocessing/reductions.py", line 495, in rebuild_storage_fd
    fd = df.detach()
  File "/usr/lib/python3.10/multiprocessing/resource_sharer.py", line 57, in detach
    with _resource_sharer.get_connection(self._id) as conn:
  File "/usr/lib/python3.10/multiprocessing/resource_sharer.py", line 86, in get_connection
    c = Client(address, authkey=process.current_process().authkey)
  File "/usr/lib/python3.10/multiprocessing/connection.py", line 502, in Client
    c = SocketClient(address)
  File "/usr/lib/python3.10/multiprocessing/connection.py", line 630, in SocketClient
    s.connect(address)
FileNotFoundError: [Errno 2] No such file or directory

Is this issue on all Python 3.x versions?

Chidu2000 avatar Mar 30 '24 16:03 Chidu2000

Confirmed the issue on Python 3.10 on macOS 14.4.1.

yeonfish6040 avatar Apr 11 '24 02:04 yeonfish6040