pybind11
pybind11 copied to clipboard
[BUG]: Potential leaks when using enum_
Required prerequisites
- [X] Make sure you've read the documentation. Your issue may be addressed there.
- [X] Search the issue tracker and Discussions to verify that this hasn't already been reported. +1 or comment there if it has.
- [ ] Consider asking first in the Gitter chat room or in a Discussion.
Problem description
Dear team, Asan reports possible memory leaks when wrapping enum from C++ (using py::enum_, precisely when a call to the value method is done.
Wrapping struct (using py::class_) seems to raise no leaks.
Config : python3.8 / gcc8.3, pybind v2.9.2
May be related to https://github.com/pybind/pybind11/issues/3228
Reproducible example code
#include <pybind11/pybind11.h>
enum Egg {
Bar,
Spam,
};
namespace py = pybind11;
PYBIND11_MODULE(example, m) {
py::enum_<Egg>(m, "Egg")
.value("Bar", Egg::Bar)
.value("Spam", Egg::Spam);
}
Likely causes include our abbusive use of attr to add values to the dictionary. The fact that we are storing a C++ pair in the python dictionary, or finally some bug with dynamic_attrs that we haven't caught. It could also be a reference cycle between the attr and the enum, in which case Python won't properly GC when the class is deleted.
Could you explain how do you know there is a leak?
Memory leaks or errors seems to be reported by your favorite memory checker (Totalview, valgrind, asan ...) if you build the pybind module with debugging options :
g++ -O0 -g -fsanitize=address -shared -std=c++11 -fPIC -I pybind11/include/ test.pybind.cpp -o example$(python3-config --extension-suffix)
According to the reports, the faulty lines seems to be
return unique_function_record(new detail::function_record());inpybind11.h::cpp_function::make_function_recordfunc->m_ml->ml_doc = signatures.empty() ? nullptr : PYBIND11_COMPAT_STRDUP(signatures.c_str());inpybind11.h::cpp_function::initialize_genericrec->def = new PyMethodDef();inpybind11.h::cpp_function::initialize_genericauto *t = PYBIND11_COMPAT_STRDUP(s);inpybind11.h::cpp_function::strdup_guard::operator ()
Context of call varies, for example here one of the stacks (sorry for readability !)
In test.pybind.cpp
.value("Bar", Egg::Bar)
In pybind11::enum_<Egg>::not-in-charge enum_<NULL> l.2179
attr("__setstate__") = cpp_function(
[](detail::value_and_holder &v_h, Scalar arg) {
detail::initimpl::setstate<Base>(
v_h, static_cast<Type>(arg), Py_TYPE(v_h.inst) != v_h.type->type);
},
detail::is_new_style_constructor(),
pybind11::name("__setstate__"),
is_method(*this),
arg("state"));
In pybind11::ccp_function (some long template here) l.100
cpp_function(Func &&f, const Extra &...extra) {
initialize(
std::forward<Func>(f), (detail::function_signature_t<Func> *) nullptr, extra...);
}
In pybind11::ccp_function::initialize (some long template here) l.177
auto *rec = unique_rec.get();
In pybind11::cpp_function::make_function_record l.162
return unique_function_record(new detail::function_record());
How did you get that? I did as follows
>>> import example
==9434==ASan runtime does not come first in initial library list; you should either link runtime to your application or manually preload it with LD_PRELOAD.
That's great, it means you compiled with gcc sanitizer sucessfully :) (that is the -fsanitize=address option, see https://gcc.gnu.org/onlinedocs/gcc/Instrumentation-Options.html). Now you have to provide the asan library at runtime, I do it with
LD_PRELOAD=/opt/tools/gcc-8.3/lib64/libasan.so python test.py
Be careful :
- The path to asan lib may will probably be different, it depends on your installation
- I think it is not possible to use the interpreter in this configuration to write a simple test.py importing the pybind module
- Sanitizer output is huge and hard to read ! In particular you will see lot of warning comming from python library itself, you have to search the ones comming from your library
- If you prefer to use valgrind, do not compile with
-fsanitize=address
I also show leaks with asan only for enum (every enum) in make_function_record()
@cjolivier01 Could this be related to the free_data function not being set correctly? A relevant PR seems like https://github.com/pybind/pybind11/pull/3229 which fixed a similar memory leak. Also could be related to the InitializingFunctionRecordDeleter in pybind11 perhaps.
We're also seeing these leaks with enums. Also, if you add the py::arithmetic tag, then an additional memory leak is generated.
After doing some more investigation, I am fairly sure it's due to use calling .attr() on enum_base which is just a handle of the object store in a separate struct of enum in a weird way. We probably need to refactor to get rid of the weird base struct in a so that the attributes are properly cleared. Something is keeping the attrs we add during construction from being 'dec_ref' properly