convert module initialization to use multi-phase-init
Part of #2274, inspired by all of #2245, #4081, and #4162
This converts the module initialization itself to use module slots and multi-phase init APIs. Module state is deliberately not covered here and left for follow ups. I also don't change anything related to subinterpreter checking, that is also left for a follow up.
Note to self: this might need to be opt-out (for a few versions) and be mentioned in the migration guide.
I like this very much!
Is this compatible with how we are currently putting pyclass' type objects in statics? We should figure out a way to store them in module state if possible.
I think it's sorta fine to leave the type objects in statics for now given we don't support subinterpreters, but yes:
- we should seek to use module state if we can anyway as a followup
- and we'd need to do so for any potential eventual subinterpreter support
I suppose eventually we'll want to allow users to have their own module state as well for their statics :-/
I think it's sorta fine to leave the type objects in statics for now given we don't support subinterpreters, but yes:
* we should seek to use module state if we can anyway as a followup * and we'd need to do so for any potential eventual subinterpreter support
This will also put the nail in the coffin for https://github.com/PyO3/pyo3/issues/1444. I'm fine with that personally, but it'll be disappointing for quite a lot of people. Maybe we should start thinking about other ways to address that use case.
I agree, I think there should be ways to take inspiration from how pybind11 makes that possible, with the caveat that Rust's lack of stable ABI makes things more tricky.
RE this PR - it looks like PyPy does not support all the APIs for multi phase init of the submodules, I will see if there's a way I can make that work with some conditional code. (🤮)
We should also ping the PyPy folks (h/t @mattip) with the APIs that are missing so that we can eventually remove the special cases.
PyPy does not support all the APIs for multi phase init of the submodules
I thought PyPy does support multi phase init, which APIs did I miss?
EDIT: I see a missing PyModule_ExecDef here but that should be PyPyModule_ExecDef. Maybe there needs to be a #[cfg_attr(PyPy, link_name = "PyPyModule_ExecDef")] decorator?
Yep PyModule_ExecDef our definition is wrong, but we also need PyModule_FromDefAndSpec2 which I couldn't see in the pypy exports.
I implemented the missing PyModule_FromDefAndSpec2 and PyModule_FromDefAndSpec on latest pypy-3.11 HEAD. With that, I will point out that cython does not use these low-level interfaces, rather it uses PyModuleDef_Init. Here is a small example in C that demonstrates a multiphase module with PyModuleDef_Init from the PyPy tests
C code using multiphase initialization
#include <Python.h>
static PyModuleDef multiphase_def;
static PyObject* multiphase_create(PyObject *spec, PyModuleDef *def) {
PyObject *module = PyModule_New("altname");
PyObject_SetAttrString(module, "create_spec", spec);
PyObject_SetAttrString(module, "create_def_eq",
PyBool_FromLong(def == &multiphase_def));
return module;
}
static int multiphase_exec(PyObject* module) {
Py_INCREF(Py_True);
PyObject_SetAttrString(module, "exec_called", Py_True);
return 0;
}
static PyModuleDef_Slot multiphase_slots[] = {
{Py_mod_create, multiphase_create},
{Py_mod_exec, multiphase_exec},
{0, NULL}
};
static PyModuleDef multiphase_def = {
PyModuleDef_HEAD_INIT, /* m_base */
"multiphase", /* m_name */
"example docstring", /* m_doc */
0, /* m_size */
NULL, /* m_methods */
multiphase_slots, /* m_slots */
NULL, /* m_traverse */
NULL, /* m_clear */
NULL, /* m_free */
};
PyMODINIT_FUNC
PyInit_multiphase(void) {
return PyModuleDef_Init(&multiphase_def);
}
testing code
import multiphase
assert multiphase.create_spec
assert multiphase.create_spec is multiphase.__spec__
assert multiphase.create_def_eq
assert multiphase.exec_called
Thanks, that's perfect, I will seek to retest with pypy nightly.
I will point out that cython does not use these low-level interfaces, rather it uses
PyModuleDef_Init.
Indeed, we primarily use that here too, but there are other code paths which directly create module objects from the definitions now written as multi-phase (e.g. submodule objects added to a root module during the exec phase)
Cool.
I will seek to retest with pypy nightly.
Sorry, I was a bit too late for last night's build, it will only appear tomorrow.
Superseded by #5525