pybind11
[QUESTION] Numpy Array to C++ - take ownership of data
Is it possible to pass data from numpy to C++ and take ownership of the memory, so it's no longer managed by Python? I have a large NumPy matrix and I don't want to copy the memory. I can use py::buffer_info to get a pointer to the data, but the pointer is not valid once Python is shut down. Another reason is that I want to release the data from the C++ side once I no longer need it.
If you don't want to copy use PYBIND11_MAKE_OPAQUE(). You can read about it here: https://pybind11.readthedocs.io/en/stable/advanced/cast/stl.html#making-opaque-types
Is it possible to allocate the memory on the C++ side and pass it back to Python?
You can return a buffer from C++ to python like this:
return pybind11::buffer_info(...)
On the Python side, this return value can be used directly as a numpy array.
I am not quite sure how to use PYBIND11_MAKE_OPAQUE() with numpy. Do you have an example?
In python I have a very simple example:
import numpy as np
arr = np.array([1, 2, 3, 4, 5])
and I want to pass arr to C++ so that C++ will "own" the data and the data won't be freed when the Python interpreter is finalized. Also, I don't want to create a copy of the data.
I cannot allocate the memory in C++ via buffer_info, because the numpy array is the output of another library.
@MartinPerry I have the same exact question, did you manage to solve your issue?
@PierreMarchand20 Unfortunately, no
@MartinPerry @PierreMarchand20 I have the same exact question, did you manage to solve your issue?
I may have figured out a solution to this! It looks like the array pointer remains valid as long as the py::buffer_info object returned by buffer.request() exists. I've written a simple wrapper for transferring a 1D NumPy array into C++ (without copying) by storing the buffer_info object in an instance variable:
template<typename T>
struct PyArray {
    py::buffer_info info;
    T *data;
    size_t size;
    PyArray(py::array_t<T> arr) :
        info { arr.request() },
        data { static_cast<T*>(info.ptr) },
        size { static_cast<size_t>(info.shape[0]) } {}
    PyArray(const PyArray &) = delete;
    PyArray &operator=(const PyArray &) = delete;
    PyArray(PyArray&&) = default;
    //...
};
Note that py::buffer_info is not copyable, so I had to delete the copy constructors and define the move constructor in PyArray. This does limit how PyArray can be used, but it should work as long as you always pass by reference.
I've tested this by creating a NumPy array in Python, using it to initialize a PyArray, deleting the original NumPy array, and then confirming that the PyArray still works. This same test fails if info is local to the constructor. I'm no expert in Python memory management, so I'm not 100% sure if this will work in all circumstances (e.g. when Python is "shut down"), but hope it helps!
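A likely explanation for why this works: py::buffer_info appears to hold a view obtained through Python's buffer protocol, and an outstanding view keeps a reference to the object that exported it. The sketch below shows the same effect in pure Python. It is an analogy, not pybind11 code: the stdlib array.array stands in for the NumPy array and memoryview stands in for py::buffer_info (both are assumptions made for illustration, and view_without_exporter_name is a hypothetical helper name).

```python
import array


def view_without_exporter_name():
    """Return a buffer view whose exporter has no remaining Python name."""
    # Stand-in for the NumPy array (assumption: any buffer exporter behaves alike).
    exporter = array.array("d", [1.0, 2.0, 3.0])
    # Stand-in for py::buffer_info: an outstanding buffer-protocol view.
    return memoryview(exporter)


view = view_without_exporter_name()
# The local name "exporter" is gone, but the view still references the object,
# so the underlying memory has not been freed.
print(view.tolist())
print(type(view.obj).__name__)

# Releasing the view is the analogue of destroying the buffer_info.
# After this call the view can no longer be used to reach the data.
view.release()
```

If this reading is right, storing the buffer_info keeps the array alive rather than transferring ownership to C++: the memory is still allocated and freed by Python, so nothing here suggests the pointer is safe to use after the interpreter has been finalized.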
I had a similar problem. We typically use Python in our C++ code base in the following way:
py::scoped_interpreter guard{};
py::dict locals;
py::exec(R"(python code here)", py::globals(), locals);
In order to move the data of a numpy ndarray computed in the Python snippet into C++, I wrote the following function:
#include <cassert>
#include <armadillo>
#include <pybind11/numpy.h>

template <typename T>
arma::Col<T> MoveFromNumpyArray(pybind11::object obj) {
    // Cannot use dynamic cast here because there are no virtual functions in
    // pybind interface
    auto np_array = static_cast<pybind11::array>(obj);
    // In order to correctly extract data from numpy array its data type should be
    // the same as T
    assert(np_array.dtype() == pybind11::dtype::of<T>());
    auto* data_ptr = static_cast<T*>(np_array.mutable_data());
    assert(np_array.size() >= 0);
    auto size = static_cast<arma::uword>(np_array.size());
    // release() gives up the handle without decrementing the reference count,
    // so the Python array object is deliberately leaked and never frees the buffer
    np_array.release();
    return {data_ptr, size, /*copy_aux_memory=*/false,
            /*strict=*/false};
}
In my case the common use of the function will be:
py::scoped_interpreter guard{};
py::dict locals;
py::exec(R"(
import numpy as np
# np.intc is the dtype of a C int, so it matches MoveFromNumpyArray<int>
x = np.array((1,2,3,4), dtype=np.intc)
)", py::globals(), locals);
auto data = MoveFromNumpyArray<int>(locals["x"]);
Therefore I need to cast from pybind11::object to pybind11::array and hope that the user passes a numpy array as the argument.
It is also very important that the user instantiates the template with the type that corresponds to numpy.ndarray.dtype.
arma::Col<T> in this example is basically std::vector<T>, except that it has a constructor for building itself on top of a given pointer without any copy.
The solution is not ideal, because it relies on the user for two crucial things, but it works. At least in my tests :)
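To illustrate why the dtype has to match T, here is a small pure-Python sketch of what happens when 64-bit integers are read through a pointer to C int. It is only an analogy: the stdlib array module stands in for NumPy, the format code "q" stands in for an int64 array (NumPy's default integer type on many platforms), "i" stands in for T = int, and reinterpret_as_c_int is a hypothetical helper name.

```python
import array


def reinterpret_as_c_int(values):
    """Store values as 64-bit ints, then read the same bytes as C ints."""
    wide = array.array("q", values)    # 'q': signed 64-bit integers
    raw = memoryview(wide).cast("B")   # view the same memory as raw bytes
    return raw.cast("i").tolist()      # 'i': C int, no conversion is done


print(reinterpret_as_c_int([1, 2, 3, 4]))
```

On a typical little-endian platform with a 4-byte int, the result has eight elements with zeros interleaved between the original values, instead of the four values that were stored. This is the kind of silent corruption the dtype assert in MoveFromNumpyArray guards against.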
Hello, thanks for the PyArray method. I tried it, but when I create a new numpy array after deleting the old one, the contents of PyArray::data are replaced by the new array. Have you tried this?
I have the same exact question. How can I malloc a buffer in C++ and use it on the Python side?