
Add NumPy Scalars

Open · sun1638650145 opened this issue 2 years ago · 9 comments

Description

Compared with processing values through py::buffer, py::numpy_scalar is more concise and easier to use. I think PR #2060 is a very good solution, but it seems to have been abandoned, so I made a small modification and resubmitted it as a new PR.

sun1638650145 avatar Dec 10 '21 08:12 sun1638650145

Logging my thoughts after looking for a bit.

That's a lot of code (cost). What's the benefit?

It makes sense to distinguish between ints of different sizes etc. for large arrays, but how often does it matter for scalars in real life? Passing a one-element array could do the trick, too. Probably a minor nuisance that's worth spending time and code on only if it becomes widespread clutter.
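
For reference, that one-element-array workaround might look roughly like this (a minimal sketch; the function name f, the int32 dtype, and the surrounding PYBIND11_MODULE context are purely illustrative):

// Fragment from a PYBIND11_MODULE body; needs <pybind11/numpy.h> and <cstdint>.
// Hypothetical binding that accepts a one-element array in place of a true NumPy scalar;
// the Python side would call f(np.array([42], dtype=np.int32)) instead of f(np.int32(42)).
m.def("f", [](py::array_t<std::int32_t> a) {
    if (a.size() != 1)
        throw py::value_error("expected a one-element array");
    return 2 * a.data()[0];  // unwrap the single element and use it as a plain int32
});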

I will wait for other maintainers to chime in.

rwgk avatar Dec 10 '21 20:12 rwgk

@rwgk I understand your thoughts, thanks for your suggestions.

sun1638650145 avatar Dec 13 '21 03:12 sun1638650145

It is not certain whether the PR will be merged, so if you want to use this feature, you can apply this patch directly.

sun1638650145 avatar Dec 27 '21 15:12 sun1638650145

I won't speak to the cost/benefit ratio. But I thought I could mention that I have found myself writing a small wrapper function like this:

template<typename T>
py::object create_numpy_scalar(T val) {
    // Requires an initialized NumPy C API (call _import_array() before use)
    // and <numpy/arrayobject.h> for PyArray_Scalar / PyArray_Descr.
    py::object dt = py::dtype::of<T>();
    PyObject *scal = PyArray_Scalar(&val, (PyArray_Descr *) dt.ptr(), py::int_(sizeof(T)).ptr());
    return py::reinterpret_steal<py::object>(scal);
}

to wrap e.g. a long double scalar to be passed to a Python function, e.g. def my_jacobian(some_scalar, some_vector, an_out_matrix). Of course, passing an ndarray with shape == (1,) as some_scalar and writing an extra line of code in my_jacobian to convert it to a NumPy scalar (if needed for the semantics in that function) should work equally well, I suppose?
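
For context, a call along those lines might look roughly like this (a sketch that assumes the create_numpy_scalar helper above; my_jacobian, its signature, and the long double dtype are illustrative):

// Sketch: wrap a long double and hand it to a Python callback shaped like
// def my_jacobian(some_scalar, some_vector, an_out_matrix).
// Assumes create_numpy_scalar from above and an initialized NumPy C API.
void call_jacobian(py::function my_jacobian, long double t,
                   py::array_t<long double> y, py::array_t<long double> out) {
    py::object scalar = create_numpy_scalar<long double>(t);  // arrives as np.longdouble
    my_jacobian(scalar, y, out);
}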

bjodah avatar Feb 01 '22 07:02 bjodah

  1. The cost of ~300 lines does seem high, but in fact most of it is repetitive code to support ints, floats and complex numbers of different precisions (these macro definitions really can't be replaced with functions to reduce the amount of code).

  2. I understand the situation you mentioned. I previously used py::buffer to implement the same functionality, but that approach has 3 obvious shortcomings (a rough sketch of the buffer version follows the list):

    • When looking at __doc__ on the Python side, the data type is lost; the signature always shows buffer instead of e.g. np.int32.
    • Representing a scalar with an array like [1, 1] also makes the Python side harder to read.
    • The readability of the C++ code becomes extremely low, and the only remedy is writing a lot of comments explaining why it is done this way.
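
To illustrate, this is roughly what the py::buffer version ends up looking like (a sketch; the function name square and the int32 choice are just for illustration):

// Sketch of the py::buffer workaround: the scalar arrives as a one-element buffer,
// so the generated docstring shows "buffer" (not np.int32) and the body unpacks it by hand.
m.def("square", [](py::buffer b) {
    py::buffer_info info = b.request();
    if (info.size != 1 || info.format != py::format_descriptor<std::int32_t>::format())
        throw py::value_error("expected a single int32 value");
    std::int32_t v = *static_cast<std::int32_t *>(info.ptr);
    return v * v;
});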

@bjodah, thank you very much. My original idea was to shift this part of the cost from users directly onto us; whether on the pybind side or the Python side, I want it to be simple and easy to use. English is not my native language, so please excuse any typing errors.

sun1638650145 avatar Feb 01 '22 10:02 sun1638650145

That's a lot of code (cost). What's the benefit?

I have been bitten by this in the past as well; the current inconsistency for 0-dim data (scalars) makes generic ND data handling at API interfaces pretty wild/complicated.
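
To make that concrete, a generic entry point currently has to special-case 0-dim input along these lines (a sketch; the name twice and the double dtype are illustrative):

// Fragment from a PYBIND11_MODULE body; needs <pybind11/numpy.h> and <vector>.
// A generic "double the input" binding: for ndim >= 1 it can return an array of matching
// shape, but for 0-dim input there is no NumPy scalar to construct, so a plain Python
// float comes back and the original dtype information is lost.
m.def("twice", [](py::array_t<double, py::array::c_style | py::array::forcecast> a) -> py::object {
    if (a.ndim() == 0)
        return py::float_(2.0 * a.data()[0]);  // scalar path has to be handled separately
    std::vector<py::ssize_t> shape(a.shape(), a.shape() + a.ndim());
    py::array_t<double> out(shape);
    for (py::ssize_t i = 0; i < a.size(); ++i)
        out.mutable_data()[i] = 2.0 * a.data()[i];
    return out;
});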

As long as this goes in the separate numpy.h header, I think better numpy scalar support is reasonable to add.

ax3l avatar Feb 18 '22 20:02 ax3l

I'm personally fine with following numpy more closely (that is, adding features already supported by numpy, within reason).

henryiii avatar Feb 18 '22 21:02 henryiii

As long as this goes in the separate numpy.h header, I think better numpy scalar support is reasonable to add.

Thanks for pointing that out, @ax3l; I hadn't given it enough weight previously.

I see the CI is currently very unhappy (42 failing). @sun1638650145, after I see you got it back to green, I'll run the Google global testing with this PR. Could you please tag me with a CI All Green message or similar when it's ready?

rwgk avatar Feb 18 '22 21:02 rwgk

As long as this goes in the separate numpy.h header, I think better numpy scalar support is reasonable to add.

Thanks for pointing that out, @ax3l; I hadn't given it enough weight previously.

I see the CI is currently very unhappy (42 failing). @sun1638650145, after I see you got it back to green, I'll run the Google global testing with this PR. Could you please tag me with a CI All Green message or similar when it's ready?

OK, I'll try to fix the bug as soon as I can.

sun1638650145 avatar Feb 18 '22 23:02 sun1638650145