
Trying to use multiprocessing results in an infinite loop on Windows 10

Open jevansbio opened this issue 2 years ago • 2 comments

I have run into an issue where creating a multiprocessing pool causes an infinite loop. After some testing, it does not seem to depend on the original script I was running, and it was easy to reproduce on the latest version.

PyOxidizer configuration:

def make_exe():
    dist = default_python_distribution()

    policy = dist.make_python_packaging_policy()

    python_config = dist.make_python_interpreter_config()
    python_config.run_module = "testscript"

    exe = dist.to_python_executable(
        name="testbuild",
        packaging_policy=policy,
        config=python_config,
    )  

    # Recursively scan the filesystem at 'path' and grab matching 'packages'
    exe.add_python_resources(
        exe.read_package_root(
            path=".",
            packages=["testscript"],
        )
    )
    
    exe.windows_runtime_dlls_mode = "always"
    #exe.tcl_files_path = "lib"  
    return exe


def make_embedded_resources(exe):
    return exe.to_embedded_resources()


def make_install(exe):
    # Create an object that represents our installed application file layout.
    files = FileManifest()
    # Add the generated executable to our install layout in the root directory.
    files.add_python_resource(".", exe)
    return files


register_target("exe", make_exe)
register_target(
    "resources", make_embedded_resources, depends=["exe"], default_build_script=True
)
register_target("install", make_install, depends=["exe"], default=True)

resolve_targets()

Test script:

import platform
import sys
from multiprocessing import freeze_support, set_start_method, get_start_method, spawn, Pool

def f(x):
    return x*x

def Main():
    # I also tried a version where this was not in a separate script, but rather in the 'if __name__' block
    #set_start_method('spawn')
    print(get_start_method())
    try:
        print(sys.frozen)
    except AttributeError:
        # sys.frozen is only set in frozen (packaged) executables
        print("False")
    print(spawn.get_executable())
    print(sys.executable)
    

    print(platform.machine())
    print(platform.version())
    print(platform.platform())
    print(platform.uname())
    print(platform.system())
    print(platform.processor())

    
    with Pool() as pool:
        print(pool.map(f, range(10)))
        
if __name__=="__main__":
    freeze_support() #theoretically not needed I think? Doesn't help anyway.
    Main()

Output:

spawn #correct start method

True #sys.frozen is True

...build\x86_64-pc-windows-msvc\debug\install\testbuild.exe #exe seems to be correct
...build\x86_64-pc-windows-msvc\debug\install\testbuild.exe
AMD64
10.0.19042
Windows-10-10.0.19042
uname_result(system='Windows', release='10', version='10.0.19042', machine='AMD64')
Windows
Intel64 Family 6 Model 126 Stepping 5, GenuineIntel

Traceback (most recent call last):
  File "multiprocessing.spawn", line 107, in spawn_main
  File "multiprocessing.reduction", line 79, in duplicate
TypeError: DuplicateHandle() argument 2 must be int, not dict
#with this traceback repeating forever until the process is killed

The underlying error seems to be related to this:

https://bugs.python.org/issue38188

However, the number of repeated errors suggests that something else is going wrong as well, or that the failure is being handled incorrectly. If it really is only this error, does anyone have any insight into how to monkey patch the PyOxidizer distribution?
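One way to narrow down where the loop comes from is to label each process as parent or child at the top of the test script. The sketch below is only a diagnostic idea, not a fix: `describe_process` is a hypothetical helper, and it assumes the child process gets far enough to run Python code from the script at all. It relies on `multiprocessing.spawn.is_forking`, which checks whether the second command-line argument is `--multiprocessing-fork`.

```python
import sys
from multiprocessing import spawn


def describe_process(argv=None):
    """Return "parent", or "child: <args>" if argv looks like a spawned worker.

    Hypothetical helper for debugging only.
    """
    argv = sys.argv if argv is None else argv
    # is_forking() is True when argv[1] == "--multiprocessing-fork".
    if spawn.is_forking(argv):
        return "child: " + " ".join(argv[2:])
    return "parent"
```

Calling `print(describe_process())` as the first statement of the script would show whether the repeated tracebacks come from many children or from one process restarting.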

jevansbio, Mar 21 '22 19:03

Just wondering if anyone else has managed to reproduce this issue? I have reproduced it on two different machines so far, but I am not sure whether I am missing something obvious.

jevansbio, Apr 26 '22 08:04

I can reproduce this issue. In interpreter.rs, PyOxidizer makes this call into Python (https://github.com/indygreg/PyOxidizer/blob/main/pyembed/src/interpreter.rs#L576):

spawn_module.getattr("spawn_main")?.call1((kwargs,))?;

This calls spawn_main (https://github.com/python/cpython/blob/3.11/Lib/multiprocessing/spawn.py#L111), whose first argument is pipe_handle. That argument apparently arrives as a dictionary when it should be an integer. It is then passed to DuplicateHandle (https://github.com/python/cpython/blob/3.11/Lib/multiprocessing/reduction.py#L88), which is where execution fails.
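The difference between passing the arguments as one positional dictionary and expanding them as keywords can be illustrated in plain Python. This is a minimal sketch, not PyOxidizer or CPython code: `spawn_main_stub` is a hypothetical stand-in that only copies the parameter names of `spawn_main`, and the handle and PID values are made up.

```python
def spawn_main_stub(pipe_handle, parent_pid=None, tracker_fd=None):
    """Hypothetical stand-in with the same parameters as spawn_main."""
    # The real function eventually hands pipe_handle to DuplicateHandle,
    # which requires an int; this check imitates that requirement.
    if not isinstance(pipe_handle, int):
        raise TypeError(
            f"pipe_handle must be int, not {type(pipe_handle).__name__}"
        )
    return pipe_handle, parent_pid


# Made-up values standing in for what the parent passes to the child.
kwargs = {"pipe_handle": 1234, "parent_pid": 5678}


def call_positional():
    # The whole dict becomes pipe_handle, so the int check fails.
    return spawn_main_stub(kwargs)


def call_keywords():
    # Each dict entry is bound to the parameter with the same name.
    return spawn_main_stub(**kwargs)
```

Here `call_positional()` raises `TypeError`, much like the traceback in the report, while `call_keywords()` returns the handle and parent PID. Whether expanding the dictionary as keywords is the right change on the Rust side is an assumption that would need checking against the pyembed source.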

diogofriggo, Feb 14 '23 13:02