multi-core-python
Can't unpickle objects defined in __main__
(https://bugs.python.org/issue37292)
ORIGINAL POST:
As of CPython 3.8.0b1, main branch (please let me know if there's a different branch I should use):
If one pickles an object that is defined in the __main__ module, sends it to a subinterpreter as bytes, and then tries unpickling it there, it fails saying that __main__ doesn't define it.
import _xxsubinterpreters as interpreters
import pickle
class C:
    pass
c = C()
interp_id = interpreters.create()
c_bytes = pickle.dumps(c)
interpreters.run_string(
    interp_id,
    "import pickle; pickle.loads(c_bytes)",
    shared={"c_bytes": c_bytes},
)
If the above is executed directly with the python command-line, it fails. If it's imported from another module, it works.
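The difference between the two cases comes from how pickle serializes class instances: it records the defining module's name and the class's qualified name, not the class body, and the receiving side has to look the class up again under that name. The sketch below imitates this in a single interpreter. The module name "fake_main" is a hypothetical stand-in for the parent's __main__, and replacing it with an empty module of the same name approximates a subinterpreter's fresh __main__; it does not exercise _xxsubinterpreters itself.

```python
import pickle
import sys
import types

# "fake_main" is a hypothetical stand-in for the parent's __main__ module.
parent_main = types.ModuleType("fake_main")
exec("class C:\n    pass\n", parent_main.__dict__)
sys.modules["fake_main"] = parent_main

# The payload stores only the names "fake_main" and "C", not the class body.
payload = pickle.dumps(parent_main.C())
print(b"fake_main" in payload, b"C" in payload)

# Simulate the receiving interpreter: same module name, but C was never defined.
sys.modules["fake_main"] = types.ModuleType("fake_main")
try:
    pickle.loads(payload)
    failure = None
except AttributeError as exc:
    failure = exc
finally:
    # Leave sys.modules as we found it.
    del sys.modules["fake_main"]

print(type(failure).__name__, failure)
```

When the script is imported from another module instead, the class's module name is an ordinary importable name, so the receiving side can import it and find the class.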
I'm unsure if that's working as intended or not; I was expecting behaviour compatible with sub-processes spawned with the spawn method, where the __main__ of the parent process is visible to the subprocess too.
Workarounds:
1 - define everything that must be pickled in an imported module
2 - use cloudpickle, which implements a hardcoded special case that makes it pickle the whole code of any object defined in __main__.
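A minimal sketch of workaround 1, under a few assumptions: the module name "picklable_defs" is hypothetical, and a fresh child process stands in for the subinterpreter, since the private _xxsubinterpreters API may differ between CPython versions. The point it illustrates is that the payload then refers to an importable module rather than __main__, so any interpreter that can import that module can unpickle it.

```python
import os
import pickle
import subprocess
import sys
import tempfile
from pathlib import Path

# Code run by a fresh interpreter, which has its own, unrelated __main__.
CHILD_CODE = (
    "import pickle, sys\n"
    "obj = pickle.loads(sys.stdin.buffer.read())\n"
    "print(type(obj).__module__, type(obj).__qualname__)\n"
)

with tempfile.TemporaryDirectory() as tmp:
    # "picklable_defs" is a hypothetical module name used for illustration.
    Path(tmp, "picklable_defs.py").write_text("class C:\n    pass\n")

    sys.path.insert(0, tmp)
    try:
        import picklable_defs
    finally:
        sys.path.remove(tmp)

    # The payload refers to picklable_defs.C rather than __main__.C.
    payload = pickle.dumps(picklable_defs.C())

    # Make the module importable in the child, then unpickle there.
    env = dict(os.environ, PYTHONPATH=tmp)
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        input=payload,
        stdout=subprocess.PIPE,
        env=env,
        check=True,
    )
    child_output = result.stdout.decode().strip()

print(child_output)
```

The same constraint applies to a subinterpreter: the module holding the class has to be importable from inside it.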
Possible future solutions:
- opt-in support for re-running __main__ in a subinterpreter
- a helper for re-running __main__ in a subinterpreter
- support in PEP 499 for updating __module__ for all objects in __main__
- opt-in support for mirroring a subinterpreter's __main__ as a named module (in sys.modules)
  - this is similar to PEP 499
  - also update __module__ for all objects
This definitely sounds like a bug. :( Thanks for finding that! Please open a new issue on bugs.python.org and feel free to nosy me.
https://bugs.python.org/issue37292
@crusaderky, Thanks!
I'm going to track this here along with other subinterpreter-related bugs that need short-term attention.
I've looked into this issue and reproduced the reported behaviour, but it's not clear to me why this should be expected to work?
I was expecting behaviour compatible with sub-processes spawned with the spawn method, where the __main__ of the parent process is visible to the subprocess too.
@crusaderky Could you help me out by providing a code snippet that reproduces the behaviour you're referring to here?
@LewisGaul
from concurrent.futures import ProcessPoolExecutor
from multiprocessing import get_context
def f():
    print("Hello world")

if __name__ == "__main__":
    with ProcessPoolExecutor(mp_context=get_context("spawn")) as ex:
        ex.submit(f).result()
Here f is pickled in the main process and unpickled in the worker process.
A bit clearer:
import os
from concurrent.futures import ProcessPoolExecutor
from multiprocessing import get_context
class C:
    def __getstate__(self):
        print("pickled in %d" % os.getpid())
        return {}

    def __setstate__(self, state):
        print("unpickled in %d" % os.getpid())

    def hello(self):
        print("Hello world")

if __name__ == "__main__":
    with ProcessPoolExecutor(mp_context=get_context("spawn")) as ex:
        ex.submit(C().hello).result()
Output:
pickled in 23480
unpickled in 23485
Hello world
Ah yes ok, thanks for the example. I'll take a look at how multiprocessing's spawn method achieves this and see if I can work out what needs doing for subinterpreters.
Thanks for looking into this, @LewisGaul (and @crusaderky). Please continue the discussion, but do it over on BPO.
FYI, this repo is intended mostly for coordinating effort and breaking down the work. We want to stick to the normal core development workflow as much as possible. Plus you're likely to get better involvement from the community that way (even if no one has chimed in on the issue there yet). :)
Noted, I'll shift the discussion to there :)