pybind11 icon indicating copy to clipboard operation
pybind11 copied to clipboard

pybind11 wrapped method segfaults if a function returns `unique_ptr<T>` when holder is `shared_ptr<T>`

Open EricCousineau-TRI opened this issue 8 years ago • 13 comments

Testing out alternatives w.r.t. #1132, I was testing out the behavior of something like this:

class A {
 public:
  A(int value)
      : value_(value) {}
  int value() const { return value_; }
 private:
  int value_{};
};

unique_ptr<A> create_instance() {
  return make_unique<A>(50);
}

shared_ptr<A> check_creation(py::function py_factory) {
  shared_ptr<A> obj;
  {
    py::object py_obj = py_factory();
    obj = py::cast<shared_ptr<A>>(py_obj);
    // ERROR: Segfault here. I believe it may be due to a double-free, since in one of the implicit `py::cast<>` calls, the move'd `unique_ptr<>` is destructed while still containing a reference.
  }
  return obj;
}

...

  py::class_<A, std::shared_ptr<A>>(m, "A")
    .def(py::init<int>())
    .def("value", &A::value);

  m.def("create_instance", &create_instance);
  m.def("check_creation", &check_creation);

...

  py::exec(R"(
factory = lambda: m.create_instance()
obj = m.check_creation(factory)
print(obj.value())
)");

Issue

It seems that move_only_holder_caster is not aggressive enough when calling get(), and it will double-free with a non-empty unique_ptr<T>: https://github.com/pybind/pybind11/blob/86e2ad4f77442c3350f9a2476650da6bee253c52/include/pybind11/cast.h#L1460-L1463 https://github.com/pybind/pybind11/blob/86e2ad4f77442c3350f9a2476650da6bee253c52/include/pybind11/cast.h#L1369-L1372

Potential Solution

I believe the simple solution would be to teach holder_helper<holder_type>::get(src) to use release() for holder_type&& when holder_type = unique_ptr<U, ...>.

Workaround

The simple workaround is to explicitly cast to shared_ptr<T> in C++, since it can be implicitly constructed from unique_ptr<T>. This sidesteps pybind11 destructing a non-empty but unused unique_ptr<T>:

shared_ptr<A> create_instance() {
  return make_unique<A>(50);
}

EricCousineau-TRI avatar Oct 10 '17 18:10 EricCousineau-TRI

Nevermind, issue just comes from assuming that type has the same holder_type: https://github.com/pybind/pybind11/blob/86e2ad4f77442c3350f9a2476650da6bee253c52/include/pybind11/pybind11.h#L1288-L1291

The cast to holder_type is erroneous since the existing_holder is actually of type unique_ptr<T>, but the type's holder is shared_ptr<T>.

EricCousineau-TRI avatar Oct 10 '17 19:10 EricCousineau-TRI

Has this issue been resolved? What is the progress here? I've just encountered this issue on version 2.2.4 and I must say it's pretty annoying.

JendaPlhak avatar Oct 03 '18 07:10 JendaPlhak

Is this still being worked on? -- Seems like a pretty fundamental thing to fix, considering the Effective Modern C++ advice highlighted here: https://stackoverflow.com/questions/37884728/does-c11-unique-ptr-and-shared-ptr-able-to-convert-to-each-others-type

I ran into the same issue with the latest pybind11 git (made my own reproducer: https://github.com/rwgk/issues/commit/eb0f2af88e4a93ac925b19732212488d8f8bac39#diff-992a0ae6ef862674387f1ddfd5b922ff).

rwgk avatar Nov 12 '19 20:11 rwgk

FTR Gitter discussion: https://gitter.im/pybind/Lobby?at=5ed1945db101510b2026d8e3

EricCousineau-TRI avatar Jun 21 '20 19:06 EricCousineau-TRI

If you're affected by this issue, please check out the smart_holder branch, it is solved there:

https://github.com/pybind/pybind11/blob/smart_holder/README_smart_holder.rst

rwgk avatar Apr 05 '21 21:04 rwgk

I just ran into this issue and took a while to work out what was causing it. I gather the smart_holder branch is still not likely to be merged anytime soon? In the meantime, the most reasonable work-around I can think of is to cast to a shared_ptr in bindings only, like .def("func", [](Object &o) { return std::shared_ptr<Object>(o.func()); }).

If the smart_holder fix is not imminent, would it at least be possible to have this issue a warning or error on compile instead of mysteriously segfaulting at runtime?

taranu avatar Jun 09 '22 17:06 taranu

@taranu If you can find a clean way to add in a static_assert, we would be happy to look at a PR.

Skylion007 avatar Jun 09 '22 17:06 Skylion007

Unfortunately I'm still too busy with the main project for which I created the smart_holder branch. But the smart_holder is a pure add-on, closely tracking master; more precisely, master + add-on. I recommend you simply drop in smart_holder instead of master. Everything should work as before, only you can then also include pybind11/smart_holder.h, add a macro for the classes for which it matters only, and use classh instead of class_.

would it at least be possible to have this issue a warning or error on compile instead of mysteriously segfaulting at runtime?

Not easily.

If you can find a clean way to add in a static_assert, we would be happy to look at a PR.

I think this can only be checked at runtime. I could be wrong, but every time I thought about it (in the past) that was my conclusion. The runtime information is either an internals change, or some kind of add-on like smart_holder.

rwgk avatar Jun 09 '22 17:06 rwgk

It seems this issue has been open for seven years now, would it be possible to add some mention of smart pointer bugs (this and #5058 at the least) somewhere in the documentation?

From this page,

pybind11 supports std::unique_ptr and std::shared_ptr right out of the box.

I just spent a couple of hours going round in circles trying to debug some pretty nasty crashes, that came up quite a way away from where I created the objects - and it looks like it was a simple double-free. The pointers immediately ticked off my brain, but, from this page,

pybind11 supports std::unique_ptr and std::shared_ptr right out of the box.

I'd explicitly not concentrated on debugging those as I remembered reading olenty of mentions of smart pointers, and previous projects where I've used some sort of smart-pointer with pretty good success, so assumed this was not likely to be the cause of the issue if they are well-supported within the library. In this case it seems pybind was treating the shared pointers I returned as if they were unique later and double-freeing when the python GC fired sporadically. Simply changing the return type I provide pybind from shared to unique pointers and the problems seems to have gone away thus far.

Overall, If I'm handing off a smart pointer to a mature library that has good support for them, I'd kinda expect expect everything to work, there to be pretty few corner cases and therefore that I don't need to worry about it. Certainly appreciate the complexities involved in something like this, but obviously under more normal circumstances, trying to cast a shared pointer to a unique one would be a very bad idea and a pretty fatal crash. So I feel if there are several open issues I can see on github, it would help to have some mention of them in the docs so at least I am aware that I may need to debug issues.

Also, just wondering, is there any need to manage multiple different holder types, if it's difficult to get the correct one when deleting? For custom pointer types, understand - but if choosing between unique and shared in the std library, I would've thought that shared pointer would fit more naturally with python's reference-counting semantics - would it be easier simply not to handle unique pointers internally? If pybind's ever been given hold of a unique pointer, then it has the sole ownership, so making a shared pointer out of this and handling that in whatever way internally is always well-defined, right? Whilst the reverse is most definitely not the case. So could we just default to using shared pointers when passing things back and forth?

stellarpower avatar Apr 17 '24 01:04 stellarpower

@stellarpower Did you already see my https://github.com/pybind/pybind11/issues/1138#issuecomment-813651125 comment here?

I've been keeping the smart_holder branch up-to-date.

rwgk avatar Apr 17 '24 03:04 rwgk

What are the current thoughts about merging the smart_holder branch back into master?

From an outside perspective, it looks like the situation on master is pretty bad, and the smart_holder branch seems to improve many things already.

Having these things on a branch is a bit of a pity, and makes things rather complicated:

  • From a user perspective, using an unreleased branch version may not be an option, if e.g. project policies require to only use official releases.
  • The issue tracker could contain bugs that apply to either master or smart_holder, which could get confusing.
  • Similarly for documentation, it is probably not feasible to maintain official documentation for both master and smart_holder.
  • The chances of having these long-standing bugs get fixed on master by some other approach don't look very high, right? So if it is the best option anyway, why not go for it?

bluenote10 avatar Apr 17 '24 08:04 bluenote10

@rwgk I did thanks, yeah - it was late last night, and it's not a problem for me for now just to use unique pointers; I think I just downloaded the latest release when I added the headers, but I could check that branch out.

I was about to ask the same, is there a particular reason this fork been merged yet? what bluenote10 is saying makes sense - in either case, I guess I just wanted to say, if it doesn't end up getting fixed in the master yet, I think it's at least worth a mention in the documentation. If longstanding bugs are present, that's not necessarily a deal-breaker, but I'd at least prefer to know about them when I start a project, so I can know when to work around.

stellarpower avatar Apr 17 '24 12:04 stellarpower

What are the current thoughts about merging the smart_holder branch back into master?

What has always been holding me back from pushing more that the smart_holder branch gets merged is that it comes with the PYBIND11_SMART_HOLDER_TYPE_CASTERS macro, which is a bit of a trap (if you use it in one extension module but forget to use it in another).

Note that the smart_holder code was added to support a big PyCLIF project, retooling the PyCLIF code generator to target pybind11, instead of the Python C API directly. That has been taking a lot longer than anticipated, for various reasons. I also needed a few more major additions to make pybind11 more suitable as a target for an automatic code generator. Those changes are actually in another branch/fork, https://github.com/google/pybind11k: I'm regularly merging pybind11 master into smart_holder, then smart_holder into pybind11k.

I'm "almost done" with the PyCLIF retooling. When I'm actually done, i.e. PyCLIF-pybind11 is running in production, I want to get rid of the PYBIND11_SMART_HOLDER_TYPE_CASTERS macro. Ideally I don't want to break compatibility with pybind11 master, but I'm not sure how exactly, or even if that's actually achievable, and how long it will take: at any step I have to satisfy many millions of tests, and fix any that break myself (which is the biggest reason why the PyCLIF retooling is taking much longer than anticipated). When smart_holder is the default, in pybind11k, I want to start making pybind11k releases.

I could totally use help! But it's a very involved project: type_caster_base and the smart_holder type_casters need to be more unified, or at least factored much better. It will need a bit of deep thought to figure out what is involved in maximizing cross-extension backward compatibility, or if it is better to give up on that idea.

If someone wants to get involved deeply, the type_caster_base / smart_holder unification-or-refactoring could also happen on the master branch. My experience with reviews of changes that other maintainers are not directly interested in has not been a positive one though (frequent radio silence).

rwgk avatar Apr 17 '24 15:04 rwgk