pybind11
pybind11 copied to clipboard
[BUG] Unbound methods should just be functions instead of instancemethod
Issue description
Objects of classes created with pybind11 can't be pretty printed with the pprint module if the result is larger than the given width. This only happens with python3.
It used to happen with cython too, but cython stopped using 'instancemethod', which solved the issue. The pprint bug was discarded because "Functions/methods should be immutable, so yes, I think it is a safe assumption that they should be hashable":
https://bugs.python.org/issue33395
Reproducible example code
example.cpp:
#include <pybind11/pybind11.h>
#include <string>
namespace py = pybind11;
struct Dummy
{
std::string text;
};
PYBIND11_MODULE(example, m) {
using namespace pybind11::literals;
py::class_<Dummy>(m, "Dummy")
.def(py::init<std::string>(), "text"_a)
.def_readonly("text", &Dummy::text)
.def("__str__", [](const Dummy& w) { return w.text; })
.def("__repr__", [](const Dummy& w) { return w.text; });
}
bug.py:
import example
import pprint
pprint.pprint(example.Dummy('abc'), width=3)
Result:
Traceback (most recent call last):
File "bug.py", line 4, in <module>
pprint.pprint(example.Dummy('abc'), width=2)
File "/usr/lib/python3.8/pprint.py", line 53, in pprint
printer.pprint(object)
File "/usr/lib/python3.8/pprint.py", line 148, in pprint
self._format(object, self._stream, 0, 0, {}, 0)
File "/usr/lib/python3.8/pprint.py", line 173, in _format
p = self._dispatch.get(type(object).__repr__, None)
TypeError: unhashable type: 'instancemethod'
That's very unfortunate. Seems that the __hash__ for instancemethod is still commented out. Not sure why; bpo-33395 doesn't seem to give a good answer?
I'm not sure how pybind11 is to blame for this bug, and not even how pybind11 could fix this, though. What did you have in mind?
Hi Yannick, since they stopped using instancemethod in cython, which solved the issue, maybe the same can be done in pybind11? (See https://github.com/cython/cython/pull/2105 which is mentioned in the pprint bug).
I don't know the details but I'm under the impression that instancemethod is obsolete for python 3 (e.g. https://stackoverflow.com/q/4364565).
@beatmax Hmmm, if that would somehow be efficient and not cause any other problems, it's something to consider, yes. I'm not sure if there was an explicit reason to go with instancemethod.
It might be nice to rephrase the title of your issue, then? This doesn't exactly make your issue clear.
It used to happen with cython too, but cython stopped using 'instancemethod', which solved the issue.
And replaced it with what exactly?
According to https://docs.python.org/3/c-api/method.html, the alternative is PyMethod_Type, but it's weird, because PyInstanceMethod_Type is the "new and improved" CPython3 thing. Also creating a new bound method is fun...
- In CPython2 it's
PyMethod_New(func, self, class); - In CPython3 it's
PyInstanceMethod_New(func);- Unless you want to use
PyMethod_New(func, self);
- Unless you want to use
Sometimes I really hate python...
Here is what they did: https://github.com/cython/cython/pull/2105/commits/a152462eba4463e4ab9292e76fb1a3e3b23aed64
Change the title to "Unbound methods should just be functions instead of instancemethod"? (based on the title of cython's PR) BTW it's been mentioned that also works for python 2 (I guess not for very old versions).
It would have been lovely if pybind11 could have just as easy a fix. Here's my first attempt:
diff --git a/include/pybind11/detail/class.h b/include/pybind11/detail/class.h
index 65dad5a..09d263d 100644
--- a/include/pybind11/detail/class.h
+++ b/include/pybind11/detail/class.h
@@ -158,7 +158,7 @@ extern "C" inline int pybind11_meta_setattro(PyObject* obj, PyObject* name, PyOb
*/
extern "C" inline PyObject *pybind11_meta_getattro(PyObject *obj, PyObject *name) {
PyObject *descr = _PyType_Lookup((PyTypeObject *) obj, name);
- if (descr && PyInstanceMethod_Check(descr)) {
+ if (descr && PyCFunction_Check(descr)) {
Py_INCREF(descr);
return descr;
}
diff --git a/include/pybind11/detail/common.h b/include/pybind11/detail/common.h
index 56385f5..cb2d05f 100644
--- a/include/pybind11/detail/common.h
+++ b/include/pybind11/detail/common.h
@@ -162,9 +162,9 @@
#include <type_traits>
#if PY_MAJOR_VERSION >= 3 /// Compatibility macros for various Python versions
-#define PYBIND11_INSTANCE_METHOD_NEW(ptr, class_) PyInstanceMethod_New(ptr)
-#define PYBIND11_INSTANCE_METHOD_CHECK PyInstanceMethod_Check
-#define PYBIND11_INSTANCE_METHOD_GET_FUNCTION PyInstanceMethod_GET_FUNCTION
+#define PYBIND11_INSTANCE_METHOD_NEW(ptr, class_) (Py_XINCREF(ptr), ptr)
+#define PYBIND11_INSTANCE_METHOD_CHECK PyCFunction_Check
+#define PYBIND11_INSTANCE_METHOD_GET_FUNCTION(func) func
#define PYBIND11_BYTES_CHECK PyBytes_Check
#define PYBIND11_BYTES_FROM_STRING PyBytes_FromString
#define PYBIND11_BYTES_FROM_STRING_AND_SIZE PyBytes_FromStringAndSize
@@ -192,7 +192,7 @@
#else
#define PYBIND11_INSTANCE_METHOD_NEW(ptr, class_) PyMethod_New(ptr, nullptr, class_)
#define PYBIND11_INSTANCE_METHOD_CHECK PyMethod_Check
-#define PYBIND11_INSTANCE_METHOD_GET_FUNCTION PyMethod_GET_FUNCTION
+#define PYBIND11_INSTANCE_METHOD_GET_FUNCTION(func) PyMethod_GET_FUNCTION(func)
#define PYBIND11_BYTES_CHECK PyString_Check
#define PYBIND11_BYTES_FROM_STRING PyString_FromString
#define PYBIND11_BYTES_FROM_STRING_AND_SIZE PyString_FromStringAndSize
diff --git a/include/pybind11/pytypes.h b/include/pybind11/pytypes.h
index 1010ad7..4c28458 100644
--- a/include/pybind11/pytypes.h
+++ b/include/pybind11/pytypes.h
@@ -466,8 +466,8 @@ PYBIND11_NAMESPACE_BEGIN(detail)
inline handle get_function(handle value) {
if (value) {
#if PY_MAJOR_VERSION >= 3
- if (PyInstanceMethod_Check(value.ptr()))
- value = PyInstanceMethod_GET_FUNCTION(value.ptr());
+ if (PyCFunction_Check(value.ptr()))
+ value = value;
else
#endif
if (PyMethod_Check(value.ptr()))
Basically, 3 things:
PyInstanceMethod_Newto the same thing cython did.PyInstanceMethod_ChecktoPyCFunction_CheckPYBIND11_INSTANCE_METHOD_GET_FUNCTION(func)to justfunc
That segfaults.
#0 0x00007ffff6a3c8eb in pybind11::detail::instance::get_value_and_holder (this=0x0, find_type=0x555555746850, throw_if_missing=false)
at /home/bstaletic/work/pybind11/include/pybind11/cast.h:328
#1 0x00007ffff6a43de9 in pybind11::cpp_function::dispatcher (self=0x7ffff74278a0, args_in=0x7ffff7588040, kwargs_in=0x0)
at /home/bstaletic/work/pybind11/include/pybind11/pybind11.h:509
#2 0x00007ffff7d389f3 in ?? () from /usr/lib/libpython3.9.so.1.0
#3 0x00007ffff7d37a9b in _PyObject_Call () from /usr/lib/libpython3.9.so.1.0
#4 0x00007ffff7d3379a in ?? () from /usr/lib/libpython3.9.so.1.0
#5 0x00007ffff7d1f2ab in ?? () from /usr/lib/libpython3.9.so.1.0
#6 0x00007ffff6a3f66f in pybind11::detail::pybind11_meta_call (type=0x555555746930, args=0x7ffff7588040, kwargs=0x0)
at /home/bstaletic/work/pybind11/include/pybind11/detail/class.h:175
#7 0x00007ffff7d1ed3d in _PyObject_MakeTpCall () from /usr/lib/libpython3.9.so.1.0
#8 0x00007ffff7d1aa82 in _PyEval_EvalFrameDefault () from /usr/lib/libpython3.9.so.1.0
#9 0x00007ffff7d1489d in ?? () from /usr/lib/libpython3.9.so.1.0
#10 0x00007ffff7d14261 in _PyEval_EvalCodeWithName () from /usr/lib/libpython3.9.so.1.0
#11 0x00007ffff7dd81c3 in PyEval_EvalCode () from /usr/lib/libpython3.9.so.1.0
#12 0x00007ffff7de879d in ?? () from /usr/lib/libpython3.9.so.1.0
#13 0x00007ffff7de403b in ?? () from /usr/lib/libpython3.9.so.1.0
#14 0x00007ffff7cf65bf in ?? () from /usr/lib/libpython3.9.so.1.0
#15 0x00007ffff7cf6777 in PyRun_InteractiveLoopFlags () from /usr/lib/libpython3.9.so.1.0
#16 0x00007ffff7c84c97 in PyRun_AnyFileExFlags () from /usr/lib/libpython3.9.so.1.0
#17 0x00007ffff7c792bb in ?? () from /usr/lib/libpython3.9.so.1.0
#18 0x00007ffff7dcb259 in Py_BytesMain () from /usr/lib/libpython3.9.so.1.0
#19 0x00007ffff7a4b152 in __libc_start_main () from /usr/lib/libc.so.6
#20 0x000055555555504e in _start ()
As you can see, the this pointer is NULL.
It would have been lovely if pybind11 could have just as easy a fix. Here's my first attempt:
This is not going to work... A pybind11 function is actually a "closure" (or "function object") of a C function capturing a detail::function_record. Currently this is implemented as a bound PyCFunction whose self is set to a capsule of detail::function_record, and this has the side effect that we have to wrap it in an instancemethod in order to reinterpret it as a free method.
~~The best solution I can think of right now is to subclass PyCFunction: (code omitted)~~ Due to limitations of CPython I don't think it's possible to subclass PyCFunction, but I think a better idea might be to roll out our own pybind11_function. This will allow us to get rid of that extra instancemethod and simplify the destruction of function_record, which we can just perform by overriding __del__. It will also solve the __hash__ problem mentioned by @beatmax.
I've already been playing around with subclassing PyCFunction (and was waiting for a private discussion with/review from @bstaletic).
This is full of hack, but works:
diff --git a/include/pybind11/detail/class.h b/include/pybind11/detail/class.h
index 2f414e5c..778de079 100644
--- a/include/pybind11/detail/class.h
+++ b/include/pybind11/detail/class.h
@@ -41,6 +41,41 @@ inline PyTypeObject *type_incref(PyTypeObject *type) {
return type;
}
+inline PyObject *pybind11_function_descr_get(PyObject *self, PyObject *obj, PyObject *) {
+ if (obj == NULL) {
+ Py_INCREF(self);
+ return self;
+ }
+ else
+ return PyMethod_New(self, obj);
+}
+
+inline PyTypeObject *make_pybind11_function_type() {
+ constexpr auto *name = "pybind11_function";
+
+ auto heap_type = (PyHeapTypeObject *) PyType_Type.tp_alloc(&PyType_Type, 0);
+ if (!heap_type)
+ pybind11_fail("make_pybind11_function_type(): error allocating type!");
+
+ auto type = &heap_type->ht_type;
+ type->tp_name = name;
+ type->tp_base = type_incref(&PyCFunction_Type);
+ type->tp_flags = Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE | Py_TPFLAGS_HEAPTYPE;
+ type->tp_setattro = PyObject_GenericSetAttr;
+ type->tp_descr_get = pybind11_function_descr_get;
+ type->tp_getset = PyCFunction_Type.tp_getset;
+
+ if (PyType_Ready(type) < 0)
+ pybind11_fail("make_pybind11_function_type(): failure in PyType_Ready()!");
+
+ return type;
+}
+
+inline PyTypeObject *get_pybind11_function_type() {
+ static auto type = make_pybind11_function_type();
+ return type;
+}
+
#if !defined(PYPY_VERSION)
/// `pybind11_static_property.__get__()`: Always pass the class instead of the instance.
diff --git a/include/pybind11/pybind11.h b/include/pybind11/pybind11.h
index 82003a90..404c9c1c 100644
--- a/include/pybind11/pybind11.h
+++ b/include/pybind11/pybind11.h
@@ -370,6 +370,7 @@ protected:
}
m_ptr = PyCFunction_NewEx(rec->def, rec_capsule.ptr(), scope_module.ptr());
+ m_ptr->ob_type = detail::get_pybind11_function_type();
if (!m_ptr)
pybind11_fail("cpp_function::cpp_function(): Could not allocate function object");
} else {
@@ -443,12 +444,12 @@ protected:
// Install docstring if it's non-empty (when at least one option is enabled)
func->m_ml->ml_doc = signatures.empty() ? nullptr : strdup(signatures.c_str());
- if (rec->is_method) {
- m_ptr = PYBIND11_INSTANCE_METHOD_NEW(m_ptr, rec->scope.ptr());
- if (!m_ptr)
- pybind11_fail("cpp_function::cpp_function(): Could not allocate instance method object");
- Py_DECREF(func);
- }
+ // if (rec->is_method) {
+ // m_ptr = PYBIND11_INSTANCE_METHOD_NEW(m_ptr, rec->scope.ptr());
+ // if (!m_ptr)
+ // pybind11_fail("cpp_function::cpp_function(): Could not allocate instance method object");
+ // Py_DECREF(func);
+ // }
}
/// When a cpp_function is GCed, release any memory allocated by pybind11
diff --git a/tests/test_constants_and_functions.py b/tests/test_constants_and_functions.py
index b980ccf1..b41ab439 100644
--- a/tests/test_constants_and_functions.py
+++ b/tests/test_constants_and_functions.py
@@ -40,3 +40,11 @@ def test_exception_specifiers():
assert m.f2(53) == 55
assert m.f3(86) == 89
assert m.f4(140) == 144
+
+
+def test_function_hash():
+ c = m.C()
+ assert hash(c.m1) != hash(c.m2)
+ assert hash(m.C.m1) != hash(m.C.m2)
+
+ assert hash(m.f1) != hash(m.f2)
But there's a lot of discussion to be have. In the first place whether adding yet another function type is a great idea, and whether subclassing is a good idea. Secondly whether we want to extend PyCFunction with the descriptor protocol functionality (to get bound methods when accessing them on an object), as shown here, or whether extending PyInstanceMethod to support tp_hash/__hash__ is a better plan.
Also, the self in PyCFunction isn't really the object of a bound method, I believe.
But given that pybind11 has had things this way since ages and complaints about hashing method are very rare, I don't believe this is currently very close to the top of the TODO list.
given that pybind11 has had things this way since ages and complaints about hashing method are very rare
I run into this hashing issue everytime an assertion involving a python object backed by pybind11 fails in a pytest unit test, because pytest tries to look the __repr__ up in a dict when describing what failed. So instead of "here's the involved objects" I get an unrelated error, and no information. Please fix it. It makes testing and debugging harder.
There is a related issue: functions created with cpp_function are not picklable. Note that PyCFunction does implement __reduce__, but then pickling of the PyCapsule object fails. This could be solved by defining a custom function type that defines __reduce__ to just return the full name of the function, for functions accessible from a module.
This does not solve the issue, but is a hack I use to get around this issue for pprinting. I'm putting it here in case anyone runs across this issue and needs something working ASAP for pprinting, etc.
I move the __repr__ to a class method, and then in Python dynamically add to the class. This means the __repr__ method is added in Python, so it's not an instance method. I've modified the original example to do this:
example.cpp
#include <pybind11/pybind11.h>
#include <string>
namespace py = pybind11;
struct Dummy
{
std::string text;
};
PYBIND11_MODULE(_example, m) {
using namespace pybind11::literals;
py::class_<Dummy>(m, "Dummy")
.def(py::init<std::string>(), "text"_a)
.def_readonly("text", &Dummy::text)
.def("__str__", [](const Dummy& w) { return w.text; });
m.def("__repr__Dummy", [](const Dummy& w) { return w.text; });
}
example.py
from _example import *
from _example import __repr__Dummy
Dummy.__repr__ = lambda self: __repr__Dummy(self)
bug.py
import example
import pprint
pprint.pprint(example.Dummy('abc'), width=3)
CMakeLists.txt
cmake_minimum_required(VERSION 3.25)
project(example)
set(PYBIND11_FINDPYTHON ON)
find_package(pybind11 REQUIRED CONFIG)
pybind11_add_module(_example example.cpp)
install(TARGETS _example DESTINATION .)
To install and test run:
cmake -B build -DCMAKE_INSTALL_PREFIX=. .
cmake --build build -j 10
cmake --install build
python bug.py