pyo3 icon indicating copy to clipboard operation
pyo3 copied to clipboard

Deprecate / remove `_Py` internal APIs from `pyo3-ffi`

Open davidhewitt opened this issue 1 year ago • 7 comments

The main outcome I got from my stream today was that @encukou and @vstinner (and presumably the rest of the Core Devs) want to regard CPython C API symbols prefixed with _Py as internal APIs which don't carry any version stability (not even necessarily in the same Python minor version patch series).

I think this means that we should consider removing all such definitions with this _Py prefix from pyo3-ffi, or at the very least making users think very hard before they choose to use these, if there's a good reason for any of these to be exposed in PyO3.

I took some notes on stream about where we use these definitions in PyO3 itself, they're on my desktop PC along with a small patch which removes use of a couple, I'll aim to push both tomorrow.

davidhewitt avatar Jan 26 '24 21:01 davidhewitt

So, excluding _Py_NewRef, which we now removed, these are the symbols in use inside PyO3 which we found while I was streaming:

  • [ ] _PyCFunctionFast / _PyCFunctionFastWithKeywords These are typedefs for function pointers to use with METH_FASTCALL. I think it's expected that these are stable API now, so I've opened https://github.com/python/cpython/pull/114627 to update these names (hopefully backwards-compatibly) in CPython. The PR got merged, so we can update PyO3 as part of updating to support 3.13 soon.

  • [x] _Py_InitializeMain This is used inside only inside a PyO3 test for PEP 587. I think to be honest the test predates pyo3-ffi-check, so it might be that we can just delete this test now.

  • [x] _PyLong_AsByteArray, _PyLong_FromByteArray, _PyLong_NumBits These are used to implement fast paths for conversion of Python integers to Rust "big" integers which are outside the range of C long long. While the measurement is dependent on the size of the integer, we found on-stream that there's as much as a 2x - 9x performance slowdown in PyO3's conversion routine for switching off this fast path. So if there's room to propose these APIs as candidates for stability upstream, it would be nice to keep using these.

  • [ ] _PyUnicode_COMPACT_DATA This is only used inside a PyO3 test, so I can clean that up trivially in a PR. However there's a related problem here that we rely on reading a C bitfield to make PyUnicode_DATA work. Nearly a year ago I opened https://github.com/python/cpython/pull/102589 to propose a (wrong) solution upstream to fix this, however @encukou already stated the correct solution on that PR and it's just waiting on me (or someone else) to complete that.

  • [x] _Py_c_sum and other PyComplex operations From talking to @encukou it sounds like these are not really planed to be stabilised any time soon. Probably we should rework our PyO3 implementation to not make these FFI calls and instead go via Python operator calls.

  • [x] _PySet_NextEntry This is used for iteration of set and frozenset when not(Py_LIMITED_API). However, @encukou pointed out to me that this might not do the right thing for set subclasses. Also, on-stream it seemed like the performance gain from using this API is also not huge (maybe ~7%). So I'm tempted to just open a PR to simplify PyO3 to remove this "fast-path".

davidhewitt avatar Jan 27 '24 21:01 davidhewitt

  • These are used to implement fast paths for conversion of Python integers to Rust "big" integers which are outside the range of C long long. While the measurement is dependent on the size of the integer, we found on-stream that there's as much as a 2x - 9x performance slowdown in PyO3's conversion routine for switching off this fast path.

I had a proposal for a "strncpy"-style function[^1] that would copy the bits of a Python integer into normal memory (or fail if you don't provide a big enough buffer). The nice thing about it is we could have an (unstable) inline version that can do fast copies for small ints, and be an optimal single API for all conversions, but I suspect even a DLL-exported version would still be as good as the range of functions we currently have.

Is that a design that would work well for PyO3? Or do you prefer to request the bit length, then allocate, then copy?

[^1]: Essentially PyLong_CopyBits(buffer, sizeof(buffer), int_object) and return the buffer size required if it doesn't fit

zooba avatar Jan 29 '24 13:01 zooba

@zooba I think that proposed API could work fine! We could use stack space for small ints, which could let us just make the one FFI call and skip allocation if it's small enough to succeed first try.

(I think the num_bigint crate will allocate for its BigInt anyway, but that's a separate concern.)

Whether you prefer to propose a new API or stabilise the existing APIs, I think we can make it work in PyO3 👍

davidhewitt avatar Jan 31 '24 10:01 davidhewitt

_PyLong_AsByteArray, _PyLong_FromByteArray, _PyLong_NumBits

It was discussed https://github.com/python/cpython/issues/111140 No clear API was proposed.

vstinner avatar Jan 31 '24 14:01 vstinner

I see that the integer-to-bytes conversions have now been added in https://github.com/python/cpython/pull/114886, thanks @zooba @encukou and all who worked to get that done! We'll be sure to switch for 3.13 👍

davidhewitt avatar Feb 17 '24 22:02 davidhewitt

@davidhewitt Do give a shout if you find it awkward in any way. We've still got time and freedom to make changes, and your case is very much the intended case, so we want it to work well (though I'm hoping we'll get some of our own improvements by using it internally too).

zooba avatar Feb 19 '24 17:02 zooba

Have addressed the _PyLong methods in #4184, it seems to work well, thanks @zooba!

davidhewitt avatar May 16 '24 18:05 davidhewitt