hpy
hpy copied to clipboard
Provide API for module state
The current state in CPython and its issues:
- user sets
m_sizeinPyModuleDef, runtime allocates the state for the user when creating the module - getting the module state in different situations:
- regular module function:
selfis already the module instance ->PyModule_GetState(self) - method on type: new calling convention that passes the declaring type ->
PyType_GetModuleState(type) - slot, such as
nb_add- calling convention cannot be changed, because both extensions and CPython call slots directly from C
- note that a legacy extension can get
PyTypeObject*pointer for an HPy type coming from another HPy extension and attempt to directly callnb_addon it - so the API is:
PyType_GetModuleByDef(Py_TYPE(self), &module_def);wheremodule_defis thePyModuleDef- this walks the MRO until it finds a type that belongs to given
PyModuleDef(such type must have been created with PyType_FromModuleAndSpec)
- this walks the MRO until it finds a type that belongs to given
- regular module function:
The issue with the last API is that one can import one module multiple times in one interpreter:
import mymod
backup = mymod
del sys.modules['mymod']
import mymod
mymod.MyType is backup.MyType # False
In such case PyModuleDef does not uniquely identify the module instance. An example that shows how this can lead to a bug is here: https://gist.github.com/steve-s/9dd4d7e4810bf4cb61302f049d03ecf5
CPython documentation: https://docs.python.org/3/howto/isolating-extensions.html
There is also a Discord conversation about this with CPython core.
Ideal API:
- provide calling convention for passing in the module state directly for regular module functions and methods on types
- change slots calling convention to always pass in the module state
- can that be a fundamental performance issue in the API? The module state would have to be retrieved from somewhere even if it is not needed. Can such retrieval (thread local read?) dominate over the execution of the slot itself?
- can we have multiple calling conventions for slots? Where would the calling convention identifier be stored? The slot pointer can be tagged with it. Would masking the tag out be a performance issue (probably fine?).
How implementation of such API can look like:
- change HPy slots calling convention
- the CPython trampoline would have to fish the module state from somewhere
PyType_GetModuleByDef(Py_TYPE(self), &module_def);walks the MRO -- to slow to be called in every slot trampoline- option: importing one module multiple times in one interpreter is initially not supported on CPython and the implementation on CPython would use something like
HPyGlobal(or thread local) to store the module state for the trampolines. Eventually if CPython implements HPy natively, it can pass in the right state. - still: the fact that other extensions may call the slot directly prevents us from allowing multiple calling conventions for a slot(?)
- change
HPyType_FromSpecto also take the module - side-note: we can provide declarative way to register types in
HPyModuleDef(advantages: declarative, fewer API calls to create and initialize the module, lessens the need for the "exec" slot in multiphase init):
HPyType_Spec MyType1Def = { ... }
HPyModuleDef def = {
.types = { MyType1Def, ..., NULL }
...
would automatically call HPyType_FromSpec and put the type into the module dict.
Idea for new/better approach:
Example usage:
HPyDef_STATE_METH(myabs, "myabs", module_state_t, HPyFunc_NOARGS)
HPy myabs_impl(HPyContext *ctx, module_state_t *state, HPy self) { ... }
// Slots:
HPyDef_STATE_SLOT(myadd, module_state_t, HPy_nb_add)
HPy myadd_impl(HPyContext *ctx, module_state_t *state, HPy self, HPy other) { ... }
sketch of generated code for a method:
// prototype to be implemented by the user:
HPy myabs_impl(HPyContext *ctx, module_state_t *state, HPy self);
// simple trampoline that Python interpreter should call
HPy myabs_state_trampoline(HPyContext *ctx, void *state, HPy self) {
return myabs_impl(ctx, (module_state_t*) state, self);
}
// CPython trampoline if this is attached to a module
PyObject* myabs_trampoline(PyObject* self) {
return myabs_state_trampoline(ctx, PyModule_GetState(self), _py2h(self));
}
// CPython trampoline if this is attached to type
PyObject* myabs_type_trampoline(PyObject* self, PyTypeObject *defining_class) {
return myabs_state_trampoline(ctx, PyType_GetModuleState(defining_class), _py2h(self));
}
sketch of generated code for a slot:
// prototype to be implemented by the user:
HPy myadd_impl(HPyContext *ctx, module_state_t *state, HPy self, HPy other);
// simple trampoline that Python interpreter should call
HPy myadd_state_impl(HPyContext *ctx, void *state, HPy self, HPy other) {
return myadd_impl(ctx, (module_state_t*) state, self, other);
}
HPyDef myadd = { ...
void* _data; // reserved for the runtime, will be filled in HPyType_FromSpec
}
// CPython trampoline for the slot
PyObject* myadd_trampoline(PyObject* self, PyObject* other) {
PyObject *mod = PyType_GetModuleByDef(Py_TYPE(self), (PyModuleDef*) myadd._data);
return myadd_state_impl(ctx, PyModule_GetState(mod), _py2h(self), _py2h(other));
}
Relevant CPython docs:
- https://docs.python.org/3/howto/isolating-extensions.html#module-state-access-from-regular-methods
- https://docs.python.org/3/howto/isolating-extensions.html#module-state-access-from-slot-methods-getters-and-setters
Suggestions for better name for the macros? We cannot overload HPyDef_METH because it takes varargs. We can overload HPyDef_SLOT, but both macros should be consistent ideally.
Additionally, we either need to add HPyType_FromSpecAndModule, or, which I would be inclined to, HPyType_FromSpec will always take a module (we need to stash its def to _data in the generated code above).
Open question: I would suggest not exposing any further API to access the state. If you need it, use appropriate convention. I think the more is declarative and carried out the Python interpreter, the better. Like the removal of HPyModule_Create with the multi-phase init.