2.3.0 regression: <class 'bytes'> is not converted to std::vector<uint8_t> anymore

Open Vlad-Shcherbina opened this issue 6 years ago • 5 comments

To reproduce, create the following C++ extension:

#include <pybind11/pybind11.h>
#include <pybind11/stl.h>
#include <vector>
#include <stdint.h>

void take_bytes(const std::vector<uint8_t> &raw) {}

PYBIND11_MODULE(my_ext, m) {
    m.def("take_bytes", &take_bytes);
}

And invoke it from a Python script like this:

import my_ext
my_ext.take_bytes(b'hello')

Using pybind11 2.2.4, this compiles and runs without errors. Using pybind11 2.3.0, this compiles, but produces the following runtime error:

Traceback (most recent call last):
...
  File "hello.py", line 2, in <module>
    my_ext.take_bytes(b'hello')
TypeError: take_bytes(): incompatible function arguments. The following argument types are supported:
    1. (arg0: List[int]) -> None

This breaking change is not mentioned in the changelog or the upgrade guide.
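As a caller-side stopgap, the argument can be turned into a list of integers before the call, since that is the `List[int]` shape the error message asks for. The sketch below assumes the binding itself cannot be changed. `to_int_list` is a hypothetical helper name, not part of pybind11, and the conversion copies the data.

```python
def to_int_list(data):
    """Hypothetical helper: turn a bytes-like object into a list of ints (0..255)."""
    if not isinstance(data, (bytes, bytearray, memoryview)):
        raise TypeError(f"expected a bytes-like object, got {type(data).__name__}")
    # Iterating bytes yields ints, so list() produces the List[int]
    # shape that the 2.3.0 overload accepts.
    return list(bytes(data))


try:
    import my_ext  # the extension from the reproduction above
except ImportError:
    my_ext = None  # keeps the sketch runnable without the compiled module

payload = to_int_list(b"hello")
if my_ext is not None:
    my_ext.take_bytes(payload)
```

This keeps the C++ signature untouched, at the cost of building a Python list for every call.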

Version info:

  • Python 3.7.3 (v3.7.3:ef4ec6ed12, Mar 25 2019, 22:22:05) [MSC v.1916 64 bit (AMD64)] on win32
  • Microsoft (R) C/C++ Optimizing Compiler Version 19.16.27026.1 for x64
  • I'm running cl.exe with the /std:c++latest or /std:c++17 flag

Vlad-Shcherbina avatar Jun 16 '19 19:06 Vlad-Shcherbina

This is an annoying one. Bytes -> vector<uint8_t> would be very natural.

jmgpeeters avatar Jan 29 '20 09:01 jmgpeeters

+1

Is there a workaround for this atm (or planned feature)?

axsaucedo avatar Nov 08 '20 10:11 axsaucedo

I have been able to use the workaround suggested in #2517, namely:

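        .def("some_func", [](kp::SomeClass &self,
                                    py::bytes &bytes) {
                // View the bytes object through the buffer protocol (no copy yet).
                py::buffer_info info(py::buffer(bytes).request());
                const char *data = reinterpret_cast<const char *>(info.ptr);
                size_t length = static_cast<size_t>(info.size);
                // Copy the raw bytes into the std::vector<char> the C++ API expects.
                self.someFunc(
                    std::vector<char>(data, data + length));
        })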

It seems like having a way to provide a conversion from Python bytes into std::vector<uint8_t> or std::vector<char> would still be quite useful, as this has resulted in 5 different functions requiring this extra boilerplate.

For my use case, I actually have a function whose C++ parameter is defined as std::vector, but it should be able to take both a Python string and a Python bytearray. Because of this, it would be quite useful if this could be supported.

Is there a reason why this would not be desired? Would it be due to potential ambiguity? If so, what are the current ambiguous use cases?
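One plausible source of ambiguity (an assumption on my part, not something confirmed in this thread) is that `str` and `bytes` are both Python sequences but produce different element types when iterated, and a `str` only becomes bytes once an encoding is chosen. A small pure-Python sketch of that difference:

```python
text = "hi"
raw = b"hi"

# Iterating a str yields one-character strings...
str_items = list(text)
# ...while iterating bytes yields integers in range(256).
bytes_items = list(raw)

# A str maps to bytes only after picking an encoding, and the
# resulting byte values depend on which encoding is picked.
utf8_items = list("\u00e9".encode("utf-8"))
latin1_items = list("\u00e9".encode("latin-1"))

print(str_items, bytes_items, utf8_items, latin1_items)
```

So a generic sequence-to-vector conversion would have to decide whether a string argument means characters, encoded bytes, or an error.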

axsaucedo avatar Nov 08 '20 11:11 axsaucedo

better to use a string_view or span as the native cast for bytes. zero copy, view only seems like the best fit, imo.
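For readers unfamiliar with the "zero copy, view only" idea, Python's own `memoryview` is a rough analogy: it exposes the same buffer protocol that the `py::buffer` workaround above relies on, and refers to the original object's memory instead of duplicating it. This is only an illustration of the concept on the Python side, not a statement about how pybind11 casts `bytes`.

```python
data = b"hello"

# A memoryview refers to the memory owned by `data`; nothing is copied here.
view = memoryview(data)

# Slicing a view produces another view over the same memory.
head = view[:2]

# A copy only happens when one is requested explicitly.
copied = bytes(head)

print(view.readonly, len(view), copied)
```

Because `bytes` is immutable, the view is read-only, which matches the "view only" part of the suggestion.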

earonesty avatar Feb 09 '23 13:02 earonesty

better to use a string_view or span as the native cast for bytes. zero copy, view only seems like the best fit, imo.

Will it work out of the box, or is a custom conversion still required?

IGR2014 avatar Oct 31 '24 12:10 IGR2014