
Can't unpickle objects defined in __main__

Open crusaderky opened this issue 6 years ago • 9 comments

(https://bugs.python.org/issue37292)

ORIGINAL POST:

As of CPython 3.8.0b1, main branch (please let me know if there's a different branch I should use):

If one pickles an object that is defined in the __main__ module, sends it to a subinterpreter as bytes, and then tries unpickling it there, it fails saying that __main__ doesn't define it.

import _xxsubinterpreters as interpreters
import pickle


class C:
    pass


c = C()

interp_id = interpreters.create()
c_bytes = pickle.dumps(c)
interpreters.run_string(
    interp_id,
    "import pickle; pickle.loads(c_bytes)",
    shared={"c_bytes": c_bytes},
)

If the above is executed directly with the python command-line, it fails. If it's imported from another module, it works.
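The difference comes from how pickle handles classes: it stores them by reference (module name plus qualified name), not by value, so the receiving side must be able to find the class under the same module name. The following is a minimal sketch of that failure mode without any subinterpreter. The module name fake_main is a hypothetical stand-in for __main__, and deleting the attribute only simulates a fresh interpreter whose __main__ never defined C.

```python
import pickle
import sys
import types

# Hypothetical stand-in for __main__; "fake_main" is an illustrative name.
fake_main = types.ModuleType("fake_main")
sys.modules["fake_main"] = fake_main


class C:
    pass


# Make C look as if it had been defined at the top level of fake_main.
C.__module__ = "fake_main"
C.__qualname__ = "C"
fake_main.C = C

payload = pickle.dumps(C())
# The payload names the module and the class; it does not carry the class body.
print(b"fake_main" in payload)

# Simulate an interpreter whose module of the same name never defined C.
del fake_main.C
error = None
try:
    pickle.loads(payload)
except AttributeError as exc:
    error = exc
print(type(error).__name__)

# Clean up the stand-in module.
del sys.modules["fake_main"]
```

When the script is imported from another module instead, the class is recorded under an importable module name, which the subinterpreter can import on its own.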

I'm unsure if that's working as intended or not; I was expecting behaviour compatible with sub-processes spawned with the spawn method, where the __main__ of the parent process is visible to the subprocess too.

Workarounds:

  • define everything that must be pickled in an imported module
  • use cloudpickle, which implements a hardcoded special case that makes it pickle the whole code of any object defined in __main__
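The first workaround can be sketched as follows. This is an illustration under stated assumptions, not the subinterpreter API itself: a child Python process stands in for the second interpreter, and the module name shared_defs and the temporary directory are hypothetical. The point is only that a class living in an importable module unpickles cleanly wherever that module can be imported.

```python
import importlib
import os
import pickle
import subprocess
import sys
import tempfile

# Code the child runs: unpickle stdin and report where the class came from.
CHILD_CODE = (
    "import pickle, sys\n"
    "obj = pickle.loads(sys.stdin.buffer.read())\n"
    "print(type(obj).__module__, type(obj).__qualname__)\n"
)

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical importable module that holds the class definition.
    with open(os.path.join(tmp, "shared_defs.py"), "w") as fh:
        fh.write("class C:\n    pass\n")

    sys.path.insert(0, tmp)
    try:
        importlib.invalidate_caches()
        shared_defs = importlib.import_module("shared_defs")
        # The pickle refers to "shared_defs.C", not "__main__.C".
        payload = pickle.dumps(shared_defs.C())
    finally:
        sys.path.remove(tmp)
        sys.modules.pop("shared_defs", None)

    # The child can import shared_defs by name, so unpickling succeeds.
    child = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        input=payload,
        env={**os.environ, "PYTHONPATH": tmp},
        capture_output=True,
        check=True,
    )
    result = child.stdout.decode().strip()

print(result)
```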

Possible future solutions:

  • opt-in support for re-running __main__ in a subinterpreter
  • a helper for re-running __main__ in a subinterpreter
  • support in PEP 499 for updating __module__ for all objects in __main__
  • opt-in support for mirroring a subinterpreter's __main__ as a named module (in sys.modules)
    • this is similar to PEP 499
    • also update __module__ for all objects

crusaderky, Jun 12 '19 15:06

This definitely sounds like a bug. :( Thanks for finding that! Please open a new issue on bugs.python.org and feel free to nosy me.

ericsnowcurrently, Jun 15 '19 00:06

https://bugs.python.org/issue37292

@crusaderky, Thanks!

ericsnowcurrently, Jun 21 '19 15:06

I'm going to track this here along with other subinterpreter-related bugs that need short-term attention.

ericsnowcurrently, Jun 21 '19 15:06

I've looked into this issue and reproduced the reported behaviour, but it's not clear to me why this should be expected to work?

I was expecting behaviour compatible with sub-processes spawned with the spawn method, where the __main__ of the parent process is visible to the subprocess too.

@crusaderky Could you help me out by providing a code snippet that reproduces the behaviour you're referring to here?

LewisGaul, Nov 21 '19 10:11

@LewisGaul

from concurrent.futures import ProcessPoolExecutor
from multiprocessing import get_context


def f():
    print("Hello world")


if __name__ == "__main__":
    with ProcessPoolExecutor(mp_context=get_context("spawn")) as ex:
        ex.submit(f).result()

f is pickled in the main process and unpickled in the worker process.

crusaderky, Nov 21 '19 22:11

A bit clearer:


import os
from concurrent.futures import ProcessPoolExecutor
from multiprocessing import get_context


class C:
    def __getstate__(self):
        print("pickled in %d" % os.getpid())
        return {}

    def __setstate__(self, state):
        print("unpickled in %d" % os.getpid())

    def hello(self):
        print("Hello world")


if __name__ == "__main__":
    with ProcessPoolExecutor(mp_context=get_context("spawn")) as ex:
        ex.submit(C().hello).result()

Output:

pickled in 23480
unpickled in 23485
Hello world
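For context, the spawn start method gets this to work by re-running the parent's main script in the child under the name __mp_main__ (so the "if __name__ == '__main__'" block is skipped) and then registering that module as __main__ in sys.modules. Below is a simplified sketch of that idea, not the multiprocessing implementation. It uses plain child processes, and the script contents and the names PARENT_SCRIPT and CHILD_CODE are hypothetical.

```python
import os
import subprocess
import sys
import tempfile

# Hypothetical parent script: defines C and, when run as __main__,
# writes a pickled instance to stdout.
PARENT_SCRIPT = '''\
import pickle
import sys


class C:
    pass


if __name__ == "__main__":
    sys.stdout.buffer.write(pickle.dumps(C()))
'''

# Child side: rebuild the parent's main module, then unpickle.
CHILD_CODE = '''\
import pickle
import runpy
import sys
import types

# Re-run the script under another name so its __main__ block is skipped.
namespace = runpy.run_path(sys.argv[1], run_name="__mp_main__")
module = types.ModuleType("__mp_main__")
module.__dict__.update(namespace)
# Alias the rebuilt module as __main__ so "__main__.C" resolves.
sys.modules["__main__"] = sys.modules["__mp_main__"] = module

obj = pickle.loads(sys.stdin.buffer.read())
print(type(obj).__qualname__)
'''

with tempfile.TemporaryDirectory() as tmp:
    script_path = os.path.join(tmp, "parent_script.py")
    with open(script_path, "w") as fh:
        fh.write(PARENT_SCRIPT)

    # Run the script as __main__ to obtain a pickle that refers to __main__.C.
    payload = subprocess.run(
        [sys.executable, script_path], capture_output=True, check=True
    ).stdout

    # Unpickle it in a second process that mirrors the parent's main module.
    child = subprocess.run(
        [sys.executable, "-c", CHILD_CODE, script_path],
        input=payload,
        capture_output=True,
        check=True,
    )
    result = child.stdout.decode().strip()

print(result)
```

A subinterpreter does not perform this step, so its __main__ starts out without the parent's definitions.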

crusaderky, Nov 21 '19 22:11

Ah yes ok, thanks for the example. I'll take a look at how multiprocessing achieves this and see if I can work out what needs doing for subinterpreters.

LewisGaul, Nov 21 '19 22:11

Thanks for looking into this, @LewisGaul (and @crusaderky). Please continue the discussion, but do it over on BPO.

FYI, this repo is intended mostly for coordinating effort and breaking down work. We want to stick to the normal core development workflow as much as possible. Plus you're likely to get better involvement from the community that way (even if no one has chimed in on the issue there yet). :)

ericsnowcurrently, Nov 22 '19 22:11

Noted, I'll shift the discussion to there :)

LewisGaul, Nov 22 '19 23:11