Not clear how to expose existing C++ vector as numpy array
This is a question of documentation rather than an issue. I can't find any example of the following very common scenario:
std::vector<int> some_func();
...
// We want to expose returned std::vector as a numpy array without copying
m.def("some_func", []() -> py::array {
auto data = some_func();
// What to do with data?? Map it with Eigen (then return what?), wrap somehow with py::buffer (how?)
})
I don't know the answer. It would be very nice to have this explained in the docs, since this scenario is rather common.
py::array will automatically copy your data if you don't give it a base argument in the constructor (though maybe that's indeed not very well documented).
If you don't want to copy, one solution is to move the std::vector into a py::capsule and use that capsule as the base for a new py::array, constructing the array from the moved vector's data(). If I'm not mistaken, the returned py::array will then keep the capsule alive and delete the vector once the capsule can be and is garbage collected.
Untested code, but this should be the implementation of the non-copy approach:
auto v = new std::vector<int>(some_func());
auto capsule = py::capsule(v, [](void *v) { delete reinterpret_cast<std::vector<int>*>(v); });
return py::array(v->size(), v->data(), capsule);
Yes, probably more black magic than you might expect. But then again, you're not doing something simple either. You are keeping a C++ object alive to make sure you can access its internal data safely, without leaking the memory.
But if you don't mind the copy, just go:
auto v = some_func();
return py::array(v.size(), v.data());
@YannickJadoul, thank you very much, it really works! I don't mind doing black magic (and the magic is in fact quite logical), but currently the user is not even aware that this kind of magic exists. Are there any plans to document the usage of py::array and py::capsule? The constructors of these types are non-trivial, and the usage of the base argument is, well, a bit arcane.
Another suggestion. Probably it makes sense to provide an easy non-copying conversion from any contiguous buffer to py::array? Something like:
auto v = new std::vector<int>(some_func());
py::array arr = array_from_buffer<int>(v, ndim, shape, strides);
which would create the corresponding py::buffer_info and capsule internally. It could be a great addition in cases where numerical data has to be returned, especially if one needs to wrap a function like:
void some_func(vector<int>& val1, vector<vector<float>>& val2);
Manually wrapping each argument with py::buffer and py::capsule into a py::array becomes tedious in such cases.
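For concreteness, here is a rough, vector-specific sketch of what such a helper could look like on top of the capsule trick above (array_from_vector is a made-up name, not existing pybind11 API; it assumes <pybind11/numpy.h> and namespace py = pybind11, as elsewhere in this thread):

template <typename T>
py::array_t<T> array_from_vector(std::vector<T>* v,
                                 std::vector<py::ssize_t> shape,
                                 std::vector<py::ssize_t> strides) {
    // The capsule owns the vector and deletes it once the array is garbage collected.
    py::capsule base(v, [](void* p) { delete reinterpret_cast<std::vector<T>*>(p); });
    return py::array_t<T>(std::move(shape), std::move(strides), v->data(), base);
}

// Usage:
//   auto* v = new std::vector<int>(some_func());
//   return array_from_vector<int>(v, {static_cast<py::ssize_t>(v->size())},
//                                 {static_cast<py::ssize_t>(sizeof(int))});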
but currently the user is not even aware that this kind of magic exists.
Agreed, I had to look into the actual headers to check the exact constructors, etc. But I don't know about planned documentation updates. If you feel like it, I'm sure a PR with more documentation on this would be gladly accepted ;-) Then again, I'm not always sure what's stable API and what're implementation details.
Probably it makes sense to provide an easy non-copying conversion from any contiguous buffer to py::array?
Not sure how easy that is to do (and how much more confusing this will make the whole situation). Maybe some kind of a static function as 'named constructor' could make sense, though?
By the way, std::vector<std::vector<int>> is not a contiguous structure. And I don't think this technique works when (un)wrapping the arguments of a function. What I just described was a way of not copying a returned std::vector.
Sure, it won't work for "input" function parameters, but it does work for "output" ones, when one transforms the C++ signature into a Python function returning a tuple of numpy arrays instead of a bunch of ref parameters (that's exactly my case). In any case, such a thing should not be automatic; the user has to make it explicit in the lambda.
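For illustration, a sketch of that "output parameters to tuple of arrays" pattern under a simplified, hypothetical signature (fill_outputs and the binding name are made up; a nested std::vector<std::vector<float>> output would first need to be flattened into a contiguous buffer, since it is not contiguous):

m.def("compute", []() {
    auto* ints   = new std::vector<int>();
    auto* floats = new std::vector<float>();
    fill_outputs(*ints, *floats);  // hypothetical C++ function filling both output vectors

    // One capsule per vector; each deletes its vector once the owning array is collected.
    py::capsule own_ints(ints, [](void* p) { delete reinterpret_cast<std::vector<int>*>(p); });
    py::capsule own_floats(floats, [](void* p) { delete reinterpret_cast<std::vector<float>*>(p); });

    return py::make_tuple(py::array(ints->size(), ints->data(), own_ints),
                          py::array(floats->size(), floats->data(), own_floats));
});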
@YannickJadoul your code works, thanks for the reference, I just wanted to point out that you are missing a parenthesis at the end of the line
auto cap = py::capsule(v, [](void *v) { delete reinterpret_cast<std::vector<int>*>(v); });
By the way, does anybody know how to get a py::array_t from a std::shared_ptr<std::vector<T>> without copying (and without using new/delete)?
I tried this:
std::shared_ptr<std::vector<float>> ptr = get_data();
return py::array_t<float>{
    ptr->size(),
    ptr->data(),
    py::capsule(ptr.get(), [](void* p){ reinterpret_cast<decltype(ptr)*>(p)->reset(); }),
};
Obviously, this will never work, because when the function returns, the local ptr is destroyed; if it held the last reference, the vector is freed while the returned array still points at it.
Using a lambda capture also does not help, because py::capsule can't accept capturing lambdas:
std::shared_ptr<std::vector<float>> ptr = get_data();
return py::array_t<float>{
    ptr->size(),
    ptr->data(),
    py::capsule([ptr](){ }), // using lambda-capture to increase lifetime of ptr
};
This solution worked (though it seems very dirty):
std::shared_ptr<std::vector<float>> ptr = get_data();
return py::array_t<float>{
    ptr->size(),
    ptr->data(),
    py::capsule(
        new auto(ptr), // <- can leak
        [](void* p){ delete reinterpret_cast<decltype(ptr)*>(p); }
    )
};
@arquolo Indeed, the only data that can be stored in a py::capsule is a single void * and a simple function pointer (this is a Python C API thing, by the way; pybind11 just made a C++ wrapper around it). So if you want the capsule to be a (co-)owner of the shared_ptr, I would think that the last solution is the only one that works and stores the actual shared_ptr object.
Is it that dirty, though? In the end, a capsule taking a std::function (or any kind of lambda/functor object) would incur this same allocation (inside of the std::function) because of the variable size of the capture.
The one thing to note, though, is that the base object doesn't need to be a capsule. It can just as well be any other object (though hopefully one that keeps the data alive), so if your shared_ptr were stored as a member of a C++ class that is exposed to Python, you could also just use that py::object as the base.
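A rough sketch of that idea, assuming a hypothetical DataHolder class and the get_data() from the snippet above; the bound holder object serves as the array's base instead of a capsule:

struct DataHolder {
    std::shared_ptr<std::vector<float>> data;
};

PYBIND11_MODULE(example, m) {
    py::class_<DataHolder, std::shared_ptr<DataHolder>>(m, "DataHolder");

    m.def("get_array", []() {
        auto holder = std::make_shared<DataHolder>();
        holder->data = get_data();            // hypothetical, returns shared_ptr<vector<float>>
        py::object owner = py::cast(holder);  // Python wrapper sharing ownership of the holder
        // The base argument keeps `owner` (and thus the shared_ptr) alive as long as the array lives.
        return py::array_t<float>(holder->data->size(), holder->data->data(), owner);
    });
}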
We define the following utility functions, which have proven to be life savers :)
template <typename Sequence>
inline py::array_t<typename Sequence::value_type> as_pyarray(Sequence&& seq) {
    // Move entire object to heap (ensure it is moveable!). Memory handled via Python capsule
    Sequence* seq_ptr = new Sequence(std::move(seq));
    auto capsule = py::capsule(seq_ptr, [](void* p) { delete reinterpret_cast<Sequence*>(p); });
    return py::array(seq_ptr->size(),  // shape of array
                     seq_ptr->data(),  // c-style contiguous strides for Sequence
                     capsule);         // numpy array references this parent
}
and the copy version
template <typename Sequence>
inline py::array_t<typename Sequence::value_type> to_pyarray(const Sequence& seq) {
    return py::array(seq.size(), seq.data());
}
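For context, a minimal usage sketch of these helpers in a binding, assuming the some_func() from the original question:

// some_func() returns std::vector<int> by value, as in the original question.
m.def("some_func",      []() { return as_pyarray(some_func()); });  // zero-copy: moves the vector
m.def("some_func_copy", []() { return to_pyarray(some_func()); });  // copies the data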
Thanks @ferdonline. However, the move-helper needs its signature changed to:
template <typename Sequence,
typename = std::enable_if_t<std::is_rvalue_reference_v<Sequence&&>>>
inline py::array_t<typename Sequence::value_type> as_pyarray(Sequence&& seq)
With such a fix, the compiler will complain if you call it without std::move.
With such a fix, the compiler will complain if you call it without std::move.
@arquolo If you call without std::move, it will bind as an lvalue reference and then inside it does the std::move anyway. IMHO that's a fine behavior.
@arquolo If you call without std::move, it will bind as an lvalue reference and then inside it does the std::move anyway. IMHO that's a fine behavior.
You will destroy the original container, then, though. That's quite unexpected if you didn't call the container with an rvalue reference.
Isn't the standard solution to use std::forward<Sequence>(seq)? In that case you'll copy if you pass an lvalue reference, and you'll move if you get an rvalue or rvalue reference.
The function is called as_pyarray and the "docs" say it will move, so I think it's fine, but you choose.
It's standard to use std::forward in case you want to pass on the same reference type. Here we don't care, we just want to transform whatever reference type to an rvalue reference.
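For reference, this is roughly what the std::forward-based variant discussed above would look like (sequence_to_pyarray is a made-up name): an lvalue argument is copied into the heap-allocated container, an rvalue is moved.

template <typename Sequence>
py::array_t<typename std::decay_t<Sequence>::value_type> sequence_to_pyarray(Sequence&& seq) {
    using Container = std::decay_t<Sequence>;
    // Copies for lvalues, moves for rvalues, thanks to std::forward.
    auto holder = std::make_unique<Container>(std::forward<Sequence>(seq));
    auto* raw = holder.get();
    py::capsule base(raw, [](void* p) { delete reinterpret_cast<Container*>(p); });
    holder.release();  // the capsule now owns the container
    return py::array(raw->size(), raw->data(), base);
}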
By the way, does anybody know how to get a py::array_t from a std::shared_ptr<std::vector<T>> without copying (and without using new/delete)?
@arquolo , you might be interested in what I have found: https://github.com/pybind/pybind11/issues/323#issuecomment-575717041
If anyone's interested in a version of @ferdonline's utility function without explicit/manual new and delete:
template <typename Sequence>
inline py::array_t<typename Sequence::value_type> as_pyarray(Sequence &&seq) {
    auto size = seq.size();
    auto data = seq.data();
    std::unique_ptr<Sequence> seq_ptr = std::make_unique<Sequence>(std::move(seq));
    auto capsule = py::capsule(seq_ptr.get(),
                               [](void *p) { std::unique_ptr<Sequence>(reinterpret_cast<Sequence *>(p)); });
    seq_ptr.release();
    return py::array(size, data, capsule);
}
Apart from avoiding new and delete, this also does not leak if for some reason py::capsule would throw.
@YannickJadoul
I'm not sure this would work?
The memory would be freed early as there is nothing left to hold onto the heap allocation after the unique_ptr goes out of scope.
Then another heap allocation could grab the same memory, and new writes could corrupt what is already there (i.e. the numpy buffer we just returned). See https://www.cplusplus.com/reference/memory/unique_ptr/get/.
@YannickJadoul This is what I am using:
/**
* \brief Returns py::array_t<T> from std::vector<T>. Efficient as zero-copy.
* - Uses std::move to obtain ownership of said vector and transfer everything to the heap.
* - Only accepts a parameter passed with std::move(...), or else the vector metadata on the stack will go out of scope (the heap data will always be fine).
* \tparam T Type.
* \param passthrough std::vector to move from.
* \return py::array_t<T> with a clean and safe reference to the contents of the moved vector.
*/
template<typename T>
inline py::array_t<T> toPyArray(std::vector<T>&& passthrough)
{
    // Pass result back to Python.
    // Ref: https://stackoverflow.com/questions/54876346/pybind11-and-stdvector-how-to-free-data-using-capsules
    auto* transferToHeapGetRawPtr = new std::vector<T>(std::move(passthrough));
    // At this point, transferToHeapGetRawPtr is a raw pointer to an object on the heap. No unique_ptr or shared_ptr; it will have to be freed with delete to avoid a memory leak.
    // Alternate implementation: use a shared_ptr or unique_ptr, but this appears to be more difficult to reason about as a raw pointer (void *) is involved - how does C++ know which destructor to call?
    const py::capsule freeWhenDone(transferToHeapGetRawPtr, [](void *toFree) {
        delete static_cast<std::vector<T> *>(toFree);
        //fmt::print("Free memory."); // Within Python, clear memory to check free: sys.modules[__name__].__dict__.clear()
    });
    auto passthroughNumpy = py::array_t<T>(/*shape=*/{transferToHeapGetRawPtr->size()},
                                           /*strides=*/{sizeof(T)},
                                           /*ptr=*/transferToHeapGetRawPtr->data(),
                                           freeWhenDone);
    return passthroughNumpy;
}
@sharpe5
The memory would be freed early as there is nothing left to hold onto the heap allocation after the unique_ptr goes out of scope.
That's why you call seq_ptr.release(), to release ownership of the pointer, right? (but only after you're certain the creation of the py::capsule worked) See https://en.cppreference.com/w/cpp/memory/unique_ptr/release
@YannickJadoul This is what I am using:
This seems quite similar (or the same?) to @ferdonline's utility function. As far as I can see, it will still leak memory when py::capsule throws, because there's nothing holding on to that raw pointer? But yes, it probably won't throw, and if it does, something else is probably wrong, so it's fine enough to use.
Also, it uses raw new/delete, which is what I tried and managed to avoid with my fragment.
@YannickJadoul You are right, your code is absolutely correct.
I can't help but think that the content of the capsule function is just a very complicated way of calling delete. I greatly prefer modern C++ and smart pointers, but if there is (void *) in the middle it becomes more difficult to reason about the data flow (for me at least!). Either smart pointers up and down the entire stack, or not at all? It is tricky to choose the right level of abstraction, and sometimes if one abstracts too much the intent gets obscured.
I did not see @ferdonline's utility function initially (see above), the one I quoted was written from first principles. It's somewhat interesting that they are virtually identical :)
I can't help but think that the content of the capsule function is just a very complicated way of calling delete.
Yes, it definitely is, but it does have the advantage of covering the corner case of exceptions in py::capsule's constructor and applying the good practice of avoiding new and delete. I don't think it's that much more complicated, so I just threw that addition out there, in case people want to use it. But do of course use whatever is most comfortable to you.
This issue has been resolved. @YannickJadoul has done a great job answering questions here. Further questions are better suited for Gitter.
I'm thinking. Maybe we can/should add a convenience function for this to pybind11, since it seems to be such a popular issue. I'll reopen to remind ourselves.
This seems to be a good place to use a memoryview for holding onto the buffer instead of a capsule? #2307 is useful for invalidating the buffer once it has been released.
Actually, I think I misunderstood the problem, never mind. A memoryview might be useful in some of these cases however.
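For completeness, a small sketch of returning a py::memoryview over existing C++ data (py::memoryview::from_buffer exists in recent pybind11 versions). Note that, unlike the capsule approach, the memoryview does not take ownership, so the buffer must outlive the view; the static vector here is purely for illustration:

m.def("get_view", []() {
    static std::vector<float> data{1.0f, 2.0f, 3.0f};
    return py::memoryview::from_buffer(
        data.data(),                                 // pointer to the buffer
        {static_cast<py::ssize_t>(data.size())},     // shape
        {static_cast<py::ssize_t>(sizeof(float))});  // strides (in bytes)
});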
For the record, I have a large Python module that has zero-copy communication between Python and C++ when working with columns in a DataFrame. It is zero-copy both ways, i.e. Python >> C++ and C++ >> Python.
It is blazingly fast.
I usually combine it with OpenMP or TBB to do multi-threaded calculations on the column data.
It is all in pybind11 and modern C++ (except for one raw pointer reference which is wrapped in a function; see above). It's easily testable: when the function is called from C++ it accepts a templated vector, and when it is called from Python it accepts a templated span.
The zero-copy C++ >> Python adapter is in my post above.
This is the zero-copy Python >> C++ adapter:
/**
* \brief Returns std::span<T> from py::array_t<T>. Efficient as zero-copy.
* \tparam T Type.
* \param passthrough Numpy array.
* \return std::span<T> with a clean and safe reference to the contents of the numpy array.
*/
template<class T=float32_t>
inline std::span<T> toSpan(const py::array_t<T>& passthrough)
{
    py::buffer_info passthroughBuf = passthrough.request();
    if (passthroughBuf.ndim != 1) {
        throw std::runtime_error("Error. Number of dimensions must be one");
    }
    size_t length = passthroughBuf.shape[0];
    T* passthroughPtr = static_cast<T*>(passthroughBuf.ptr);
    std::span<T> passthroughSpan(passthroughPtr, length);
    return passthroughSpan;
}
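A possible usage sketch for the toSpan helper above (requires C++20 for std::span; sum_array is just an illustrative binding):

// Zero-copy read of a 1-D float32 numpy array inside a binding.
m.def("sum_array", [](const py::array_t<float>& arr) {
    double total = 0.0;
    for (float x : toSpan<float>(arr)) {
        total += x;
    }
    return total;
});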
Hi, I would like to check whether the cleanup function is really called, so I wrote the following code.
auto v = new std::vector<int>(some_func());
auto capsule = py::capsule(v, [](void *v) {
    py::scoped_ostream_redirect output;
    std::cout << "deleting int vector\n";
    delete reinterpret_cast<std::vector<int>*>(v);
});
return py::array(v->size(), v->data(), capsule);
However, "deleting int vector" is not printed out when I run a python script. I even add the following python code at the end of the python script, but there was no use.
import gc
gc.collect(2)
gc.collect(1)
gc.collect(0)
Could you help me get the cleanup function to actually be called? Thank you.
@tlsdmstn56-2 You need to delete the variable returned by the pybind11 module on the Python side, or else the memory will not be freed. py::array returns a zero-copy reference to the data, so the memory will be held on the C++ side until it is no longer needed on the Python side.
del my_variable
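A sketch of the Python-side check (module and function names are illustrative): once the last reference to the returned array is dropped, the capsule's destructor runs and the message should be printed.

# example is a hypothetical extension module exposing some_func()
import gc
import example

arr = example.some_func()
print(arr[:5])

del arr       # drop the last Python reference to the array
gc.collect()  # usually unnecessary: the refcount already hit zero at `del`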
@sharpe5
Your toSpan adapter is great for sharing the raw data, but how does it handle ownership? It looks like the short answer is that it doesn't, but maybe I'm missing something. Thanks!