pybind11
pybind11 copied to clipboard
WIP Python 3.11 related
Description
WIP / experiments
Suggested changelog entry:
@vstinner I saw you worked on safe_path. I'm wondering if you have clues that could help me understand a difference in behavior between Python 3.10.5 and 3.11.0b5:
- 3.10.5: pybind11
test_embedupdatessys.pathfromPYTHONPATH - 3.11.0b5:
PYTHONPATHis ignored in the exact same situation***
Does that ring any bells? Some details are below, JIC.
@henryiii I wonder if this could have to do with PR #3923, which introduced a new initialize_interpreter() implementation specifically for 3.11:
https://github.com/pybind/pybind11/pull/3923/files#diff-80008385afb03df2a6e6d29a0b500fbaf9593d1baba5a5bda112998ecdfdb131
*** The "exact same situation" is possibly unusual (interactively building Python from sources with configure; make install; using my own scons-based tools to build the pybind11 tests), but the only variable is the Python version.
I added fprintf to pybind11 test_embed sources: 198c9c21372047cf43856d4d01f0cc56d2f9a11f
And patched the Python sources like this:
--- Objects/unicodeobject.c.orig 2022-08-05 08:18:54.186431792 -0700
+++ Objects/unicodeobject.c 2022-08-04 17:36:36.375312426 -0700
@@ -16005,13 +16005,14 @@
init_fs_encoding(PyThreadState *tstate)
{
PyInterpreterState *interp = tstate->interp;
+ _Py_DumpPathConfig(tstate);
/* Update the filesystem encoding to the normalized Python codec name.
For example, replace "ANSI_X3.4-1968" (locale encoding) with "ascii"
(Python codec name). */
PyConfig *config = (PyConfig*)_PyInterpreterState_GetConfig(interp);
if (config_get_codec_name(&config->filesystem_encoding) < 0) {
- _Py_DumpPathConfig(tstate);
+ //_Py_DumpPathConfig(tstate);
return _PyStatus_ERR("failed to get the Python codec "
"of the filesystem encoding");
}
I'm seeing this with 3.10.5:
Python path configuration:
PYTHONHOME = (not set)
PYTHONPATH = '/usr/local/google/home/rwgk/forked/build_py3110b5/lib'
program name = '/usr/local/google/home/rwgk/usr_local_like/Python-3.10.5/bin/python3'
isolated = 0
environment = 1
user site = 1
import site = 1
sys._base_executable = '/usr/local/google/home/rwgk/usr_local_like/Python-3.10.5/bin/python3'
sys.base_prefix = '/usr/local/google/home/rwgk/usr_local_like/Python-3.10.5'
sys.base_exec_prefix = '/usr/local/google/home/rwgk/usr_local_like/Python-3.10.5'
sys.platlibdir = 'lib'
sys.executable = '/usr/local/google/home/rwgk/usr_local_like/Python-3.10.5/bin/python3'
sys.prefix = '/usr/local/google/home/rwgk/usr_local_like/Python-3.10.5'
sys.exec_prefix = '/usr/local/google/home/rwgk/usr_local_like/Python-3.10.5'
sys.path = [
'/usr/local/google/home/rwgk/forked/build_py3110b5/lib',
'/usr/local/google/home/rwgk/usr_local_like/Python-3.10.5/lib/python310.zip',
'/usr/local/google/home/rwgk/usr_local_like/Python-3.10.5/lib/python3.10',
'/usr/local/google/home/rwgk/usr_local_like/Python-3.10.5/lib/python3.10/lib-dynload',
]
Running tests in directory "/usr/local/google/home/rwgk/forked/pybind11/tests/test_embed":
LOOOK PYTHONPATH /usr/local/google/home/rwgk/forked/build_py3105/lib:/foo/bar
LOOOK sys.path ['', '/usr/local/google/home/rwgk/forked/build_py3105/lib', '/foo/bar', '/usr/lib/python310.zip', '/usr/lib/python3.10', '/usr/lib/python3.10/lib-dynload', '/usr/local/lib/python3.10/dist-packages', '/usr/lib/python3/dist-packages', '/usr/lib/python3.10/dist-packages']
===============================================================================
All tests passed (1553 assertions in 11 test cases)
And this with 3.11.0b5:
Python path configuration:
PYTHONHOME = (not set)
PYTHONPATH = '/usr/local/google/home/rwgk/forked/build_py3110b5/lib'
program name = '/usr/local/google/home/rwgk/usr_local_like/Python-3.11.0b5/bin/python3'
isolated = 0
environment = 1
user site = 1
safe_path = 0
import site = 1
is in build tree = 0
stdlib dir = '/usr/local/google/home/rwgk/usr_local_like/Python-3.11.0b5/lib/python3.11'
sys._base_executable = '/usr/local/google/home/rwgk/usr_local_like/Python-3.11.0b5/bin/python3'
sys.base_prefix = '/usr/local/google/home/rwgk/usr_local_like/Python-3.11.0b5'
sys.base_exec_prefix = '/usr/local/google/home/rwgk/usr_local_like/Python-3.11.0b5'
sys.platlibdir = 'lib'
sys.executable = '/usr/local/google/home/rwgk/usr_local_like/Python-3.11.0b5/bin/python3'
sys.prefix = '/usr/local/google/home/rwgk/usr_local_like/Python-3.11.0b5'
sys.exec_prefix = '/usr/local/google/home/rwgk/usr_local_like/Python-3.11.0b5'
sys.path = [
'/usr/local/google/home/rwgk/forked/build_py3110b5/lib',
'/usr/local/google/home/rwgk/usr_local_like/Python-3.11.0b5/lib/python311.zip',
'/usr/local/google/home/rwgk/usr_local_like/Python-3.11.0b5/lib/python3.11',
'/usr/local/google/home/rwgk/usr_local_like/Python-3.11.0b5/lib/python3.11/lib-dynload',
]
Running tests in directory "/usr/local/google/home/rwgk/forked/pybind11/tests/test_embed":
LOOOK PYTHONPATH /usr/local/google/home/rwgk/forked/build_py3110b5/lib:/foo/bar
LOOOK sys.path ['', '/usr/local/google/home/rwgk/usr_local_like/Python-3.11.0b5/lib/python311.zip', '/usr/local/google/home/rwgk/usr_local_like/Python-3.11.0b5/lib/python3.11', '/usr/local/google/home/rwgk/usr_local_like/Python-3.11.0b5/lib/python3.11/lib-dynload', '/usr/local/google/home/rwgk/usr_local_like/Python-3.11.0b5/lib/python3.11/site-packages']
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
test_embed is a Catch v2.12.2 host application.
Run with -? for options
-------------------------------------------------------------------------------
Restart the interpreter
-------------------------------------------------------------------------------
/usr/local/google/home/rwgk/forked/pybind11/tests/test_embed/test_interpreter.cpp:174
...............................................................................
/usr/local/google/home/rwgk/forked/pybind11/tests/test_embed/test_interpreter.cpp:192: FAILED:
REQUIRE( py::module_::import("external_module").attr("A")(123).attr("value").cast<int>() == 123 )
due to unexpected exception with message:
ModuleNotFoundError: No module named 'external_module'
===============================================================================
test cases: 11 | 10 passed | 1 failed
assertions: 1537 | 1536 passed | 1 failed
pybind11 seems to use a different API on Python 3.10 and Python 3.11.
PySys_SetArgvEx() called after Python init on Python 3.10 versus Py_InitializeFromConfig() on Python 3.11.
Your dump says:
PYTHONPATH = '/usr/local/google/home/rwgk/forked/build_py3110b5/lib'
...
environment = 1
So PYTHONPATH was used. It's part of the computed sys.path.
pybind11 doesn't call PySys_SetArgvEx() on Python 3.11. That's the function which changes sys.path.
- https://docs.python.org/dev/c-api/init.html#c.PySys_SetArgvEx
- https://docs.python.org/dev/c-api/init_config.html#init-config
- https://docs.python.org/dev/c-api/init_config.html#c.PyConfig.safe_path
@vstinner Thanks a lot for the explanation and the pointers!
I see PySys_SetArgvEx() is deprecated in 3.11.
So PYTHONPATH was used. It's part of the computed sys.path.
But evidently the computed sys.path is not in effect in test_embed. — What's the recommended way to restore the established behavior?
I decided to try something straightforward, using PyRun_SimpleString(), similar to what we do already for add_program_dir_to_path (I believe @henryiii did this based on your recommendation here):
--- a/include/pybind11/embed.h
+++ b/include/pybind11/embed.h
@@ -166,6 +166,9 @@ inline void initialize_interpreter(bool init_signal_handlers = true,
throw std::runtime_error(PyStatus_IsError(status) ? status.err_msg
: "Failed to init CPython");
}
+ // Emulates established behavior (prior to Python 3.11 implemented by PySys_SetArgvEx()).
+ PyRun_SimpleString("import sys, os; "
+ "sys.path[:0] = os.environ.get('PYTHONPATH', []).split(os.pathsep)");
if (add_program_dir_to_path) {
PyRun_SimpleString("import sys, os.path; "
"sys.path.insert(0, "
With that I get back the established behavior and test_embed passes in my situation. Is that approach OK?
But evidently the computed sys.path is not in effect in test_embed.
Sorry, I don't get how you manage to get PYTHONPATH environment variable being ignored. If you are looking for my help, it would help if you could write a short C or C++ program reproducing your issue. pybind11 is a very large code base that I don't know.
There seems to be some kind of conflict in expectations & observations:
Sorry, I don't get how you manage to get PYTHONPATH environment variable being ignored.
But earlier:
Your dump says:
PYTHONPATH = '/usr/local/google/home/rwgk/forked/build_py3110b5/lib'
...
environment = 1
So PYTHONPATH was used. It's part of the computed sys.path.
=========================================================================
it would help if you could write a short C or C++ program reproducing your issue.
At the moment I'm leaning more towards root-causing why environment = 1 but sys.path is not updated.
@vstinner Would you have a pointer to the cpython sources that update sys.path? (I looked around previously but didn't get lucky. I'll try again.)
At the moment I'm leaning more towards root-causing why
environment = 1butsys.pathis not updated.
Interestingly — and luckily — I'm seeing a behavior change between 3.11.0b5 and 3.11.0rc1, both installed from tar files.
Everything except 3.11.0b5 vs 3.11.0rc1 being equal, when running test_embed with
- 3.11.0b5: The
Python path configuration:output appears 1 time. - 3.11.0rc1: The
Python path configuration:output appears 13 times.
I believe it is enlightening, because only the 1st Python path configuration: has environment = 1, but the other 12 have environment = 0. As it happens, the failing unit test is after the 5th and before the 6th Python path configuration: output.
This observation got me to look again at the new 3.11-specific initialize_interpreter() implementation. We are using PyConfig_InitIsolatedConfig() which is actually documented to ignore environment variables:
- https://docs.python.org/3/c-api/init_config.html#isolated-configuration
I tried adding config.use_environment = 1; instead of my previous PyRun_SimpleString() diff, but that didn't help.
What do we want to do?
A. Document the pybind11 behavior change? (What do we want to suggest as a solution for people who relied on the old behavior?)
B. Switch to PyConfig_InitPythonConfig() instead? (What are the pros and cons?)
C. Add a bool isolated = false; to select between PyConfig_InitPythonConfig() or PyConfig_InitIsolatedConfig()?
D. Keep my PyRun_SimpleString() diff? Maybe behind a bool add_environ_pythonpath_to_path = true option?
@henryiii @Skylion007 @vstinner what is your vote or recommendation?
I don't like A but don't have enough background to decide between B, C, D. D doesn't look very general. Maybe B or C?
PyConfig_InitIsolatedConfig() sets isolated to 1 and isolated=1 implies use_environment=0.
- https://docs.python.org/dev/c-api/init_config.html#c.PyConfig.isolated
- https://docs.python.org/dev/c-api/init_config.html#init-isolated-conf
B. Switch to PyConfig_InitPythonConfig() instead? (What are the pros and cons?)
That's closer to the regular "python" program. You can then tweak configuration from these defaults. For example, it uses isolated=0 and use_environment=1. Example:
- https://docs.python.org/dev/c-api/init_config.html#c.PyConfig.use_environment : "Default: 1 in Python config and 0 in isolated config."
@vstinner Would you have a pointer to the cpython sources that update sys.path? (I looked around previously but didn't get lucky. I'll try again.)
If you are talking about PySys_SetArgvEx() that you only call on Python 3.10 and older:
- https://github.com/python/cpython/blob/73d8ffefe95791fa1f036b029cf51f907a89ee42/Python/sysmodule.c#L3322-L3363
- https://github.com/python/cpython/blob/73d8ffefe95791fa1f036b029cf51f907a89ee42/Python/pathconfig.c#L376-L513
I only now got a chance to poke some more at this, trying to use PyConfig_InitPythonConfig() instead:
- PyConfig_InitIsolatedConfig(&config);
+ PyConfig_InitPythonConfig(&config);
With that as the only diff against current master, the PYTHONPATH-related failure is resolved, but it is failing here instead:
TEST_CASE("sys.argv gets initialized properly") {
py::finalize_interpreter();
{
py::scoped_interpreter default_scope;
auto module = py::module::import("test_interpreter");
auto py_widget = module.attr("DerivedWidget")("The question");
const auto &cpp_widget = py_widget.cast<const Widget &>();
REQUIRE(cpp_widget.argv0().empty());
}
std::string cpp_widget_argv0;
{
char *argv[] = {strdup("a.out")};
py::scoped_interpreter argv_scope(true, 1, argv);
std::free(argv[0]);
auto module = py::module::import("test_interpreter");
auto py_widget = module.attr("DerivedWidget")("The question");
const auto &cpp_widget = py_widget.cast<const Widget &>();
cpp_widget_argv0 = cpp_widget.argv0();
}
REQUIRE(cpp_widget_argv0 == "a.out"); // FAILURE
py::initialize_interpreter();
}
Note that I moved the REQUIRE out, otherwise it segfaults.
With all other tested removed, this simply fails. With others tests present, it turns into a segfault (possibly triggered by the following test) even with the REQUIRE moved out.
Commenting out the std::free(argv[0]); call didn't help.
I have to give up for now. We don't use embedding, I cannot spend more time on this. For my local testing I can just use my PyRun_SimpleString() workaround.
If nobody else gets to this before the next pybind11 release, we should at least document that PYTHONPATH is not used anymore with Python 3.11 and put a "help wanted" tag on it.
Python still uses PYTHONPATH env var. I cannot explain why pybind11 no longer uses it since you didn't provide a short example reproducing your issue. Python initialization is very complicated, they are like 50+ options, configuration files, etc.
Python still uses PYTHONPATH env var. I cannot explain why pybind11 no longer uses it since you didn't provide a short example reproducing your issue.
It's purely a pybind11 issue, because we're using PyConfig_InitIsolatedConfig() for Python >= 3.11. I think it just escaped our attention and standard testing that this introduced a pybind11 behavior change.
Python initialization is very complicated, they are like 50+ options, configuration files, etc.
That's probably why I got stuck when trying PyConfig_InitPythonConfig().
To summarize:
- With
PyConfig_InitIsolatedConfig()PYTHONPATHis not used as documented (i.e. the config is working as intended). - With
PyConfig_InitPythonConfig()the unit test that depends onPYTHONPATH— in my environment (!) — is working, but a different unit test is failing. I don't know why.
With PyConfig_InitIsolatedConfig() PYTHONPATH is not used as documented (i.e. the config is working as intended).
You should be able to use PYTHONPATH if you set isolated=0 and use_environment=1.
You should be able to use PYTHONPATH if you set isolated=0 and use_environment=1.
Confirmed, thanks @vstinner! (Previously I only tried use_environment=1 without also setting isolated=0.)
With that figured out, this PR boils down to just adding two lines in embed.h (I just need to remove the debug printf code).
But I also want to add a unit test that fails if PYTHONPATH is not used. I haven't worked on the embedding tests before, therefore that's a little adventure for me.
Confirmed, thanks @vstinner!
Nice! Maybe the effect of isolated=1 is not well explained in the doc.
- https://docs.python.org/dev/c-api/init_config.html#c.PyConfig.isolated
- https://docs.python.org/dev/c-api/init_config.html#init-isolated-conf
@Skylion007 @henryiii
This (small) PR is ready for review now.
I like the final change, short and simple :-D
Congrats for fixing your issue!