pybind11
pybind11 copied to clipboard
Demo: `return_value_policy::copy` ignored
Description
Demo: return_value_policy::copy ignored after a Python object was created already with return_value_policy::reference_internal. Back-ported from pywrapcc test_class_sh_property_stl.cpp,py
Suggested changelog entry:
Hi @wjakob, what do you think about the existing behavior? It's not very likely to be a problem in real life, but if it bites, it could be very surprising and extremely difficult to debug, especially in a large system.
See test_cpy_after_ref in test_property_stl.py. To see how that can bite in real life, swap these lines in test_persistent_holder:
c0 = h.vec_fld_hld_cpy[0] # Must be first. See test_cpy_after_ref().
r0 = h.vec_fld_hld_ref[0] # Must be second.
With those lines swapped the test fails:
Running tests in directory "/usr/local/google/home/rwgk/forked/pybind11/tests":
=========================================================== test session starts ============================================================
platform linux -- Python 3.10.9, pytest-7.1.2, pluggy-1.0.0
C++ Info: Debian Clang 14.0.6 C++17 __pybind11_internals_v4_clang_libstdcpp_cxxabi1002__ PYBIND11_SIMPLE_GIL_MANAGEMENT=False
rootdir: /usr/local/google/home/rwgk/forked/pybind11/tests, configfile: pytest.ini
plugins: xdist-3.0.2, flakefinder-1.1.0
collected 3 items
test_property_stl.py .F. [100%]
================================================================= FAILURES =================================================================
__________________________________________________________ test_persistent_holder __________________________________________________________
def test_persistent_holder():
h = m.VectorFieldHolder()
r0 = h.vec_fld_hld_ref[0] # Must be second.
c0 = h.vec_fld_hld_cpy[0] # Must be first. See test_cpy_after_ref().
assert c0.fld.wrapped_int == 300
assert r0.fld.wrapped_int == 300
h.reset_at(0, 400)
> assert c0.fld.wrapped_int == 300
E assert 400 == 300
E + where 400 = <pybind11_tests.property_stl.Field object at 0x7f0e8f7ccaf0>.wrapped_int
E + where <pybind11_tests.property_stl.Field object at 0x7f0e8f7ccaf0> = <pybind11_tests.property_stl.FieldHolder object at 0x7f0dca754ff0>.fld
c0 = <pybind11_tests.property_stl.FieldHolder object at 0x7f0dca754ff0>
h = <pybind11_tests.property_stl.VectorFieldHolder object at 0x7f0dca754f70>
r0 = <pybind11_tests.property_stl.FieldHolder object at 0x7f0dca754ff0>
test_property_stl.py:25: AssertionError
========================================================= short test summary info ==========================================================
FAILED test_property_stl.py::test_persistent_holder - assert 400 == 300
======================================================= 1 failed, 2 passed in 0.12s ========================================================
I think (but have not verified) that the existing behavior goes back to this
https://github.com/pybind/pybind11/blob/5bbcba548a27baf8d6bfb7993e814da24b432dbf/include/pybind11/detail/type_caster_base.h#L524-L526
pre-empting this
https://github.com/pybind/pybind11/blob/5bbcba548a27baf8d6bfb7993e814da24b432dbf/include/pybind11/detail/type_caster_base.h#L546-L562
Hi @rwgk — I agree, this is super-dangerous and something I did not think of when coding up the original logic a long time ago. I introduced a change here in nanobind by always prioritizing return_value_policy::copy and return_value_policy::move when the instance is already registered with the binding tool. They return a new object in that case. Can't see any major reasons against pybind11 doing the same thing besides existing software potentially having come to rely on the current bug-prone behavior.
This goes deeper by the way: what about the other return value policies, when returning an object that is already registered with the binding tool? All of them are currently ignored.
-
return_value_policy::reference: ignored, and that seems just fine. -
return_value_policy::take_ownership: no ownership will be taken. But should it? Perhaps the previous version of the object was returned viareturn_value_policy::reference. -
return_value_policy::reference_internal: This entangles the lifetime of the parent and child. If the instance already exists, then that doesn't happen. Might be problematic.
I don't do anything special for these other cases in nanobind, because I wasn't 100% sure what to do about them. It would be be good to decide on a way of resolving all of these situations, and then implementing & documenting them consistently in both binding tools.
Thanks Wenzel! That's very useful background information.
I don't do anything special for these other cases in nanobind, because I wasn't 100% sure what to do about them.
What I usually do in this situation is experimenting: run global testing with different ideas. Reviewing the results usually gives me a much more certainty than going with just thinking and guessing.
It might take me a while to get back here. The problem didn't come up repeatedly, and we're down to a very tiny fraction of failing tests, therefore I came to look at it as a fringe case. But it would definitely be good to get rid of this trap.
If you or anyone wants to try things out (changes in pybind11), please let me know. I can offer running global tests for experimentation. It's not a lot of work for me (only for the machines).