rust-numpy Update pyo3 to 0.22.0 (without "py-clone")

This PR builds on #431 to migrate to PyO3 0.22.0 without needing to activate the problematic py-clone feature.

Background

PyO3 recently released version 0.22.0 which includes many new features, but also includes the next step in deprecating the gil-refs API by putting it behind the opt-in gil-refs feature flag. @bschoenmaeckers's PR went long way towards updating numpy for the migration by mirroring the gil-refs deprecation and feature-gating the methods that depend on it, but @juntyr raised valid concerns about its unconditional activation of PyO3's py-clone feature flag.

The problem with `py-clone`

The PyO3 0.22.0 release includes PyO3#4095's removal of the ability to clone Py<T> while the GIL is not held, which was implemented by removing its Clone impl and requiring users to either convert to Bound<T> before cloning or call the clone_ref method instead. The opt-in py-clone feature flag was added that restores the Clone implementation, but a panic is triggered if a clone is attempted while the GIL is not held. Enabling this feature may be convenient and fairly harmless in some use cases but can be hugely problematic in others, and we should not unconditionally impose the risk of panicking clones on all numpy users.

Why not just add our own `py-clone` feature?

This was suggested in #431, but has a major problem related to the Element impl for PyObject (aka Py<PyAny>). The problem is that Clone is a supertrait of Element, so PyObject losing its Clone impl when the py-clone feature is disabled means its Element impl must be disabled as well. So if we made the feature opt-in as was done by PyO3, then by default we would lose all support for object arrays which would be a huge regression. Making the feature opt-out instead is also a bad option, so we need another way.

Could we implement `Element` for `Bound<PyAny>` instead?

No. Unfortunately Bound<T> does not (and can not) implement Send, which is another requirement for implementing Element. So retaining the Element impl for PyObject is our only option for supporting object arrays without major reworks.

Eliminating our dependence on `py-clone`

The sad thing about this issue is that it really shouldn't be a problem in the context of numpy -- all of our API's require the GIL anyways, so it should always be safe to clone the PyObjects in our arrays. But that that doesn't help the fact that we still need a generic way to clone elements (and to_owned slices and array views), and we were relying on the Element trait's Clone requirement to do so. This PR solves the dilemma by replacing the Clone supertrait with a new trait PyClone, which can be safely implemented for PyObject and all existing Element impls.

The new `PyClone` trait

The new PyClone trait is a weaker form of the standard Clone trait that represents types which can be cloned while the GIL is held. Philosophically (though not in practice), this includes all Clone types that don't care about the GIL, as well as types like Py<T> that can only be cloned when it is held.

The trait has the following (simplified) definition:

pub trait PyClone: Sized {
    /// Create a clone of the value while the GIL is guaranteed to be held.
    fn py_clone(&self, py: Python<'_>) -> Self;
}

Any type that implements Clone can trivially implement PyClone by ignoring the py argument and simply delegating to its clone method. Meanwhile, PyObject can implement this trait by using the provided Python token to call Py::clone_ref. The actual trait also defines a couple of methods (not shown above) for cloning collections, which have default implementations that types can override if there is a more efficient way to do so than simply mapping the py_clone method to each item.

Adding a trait like this to PyO3 was discussed in PyO3#4133, but it is not clear if/when this would be implemented. If such a trait ends up getting added to PyO3 we can later transition to using it, but I don't think we should hold up numpy's migration to 0.22.0 waiting for that to happen.

Drawbacks

Ideally we would provide a blanket impl of PyClone for all T: Clone and then a specific impl for PyObject, which would let us avoid the breaking change for anyone with custom Element impls that will now need to add a PyClone impl as well. Unfortunately, this is not allowed due to possible impl overlaps since the compiler asserts that PyO3 restoring the Clone impl for Py<T> should not be a breaking change. If PyClone trait was instead defined in the pyo3 crate then they could provide these impls, but it is unlikely that they would (as discussed in PyO3#4133) since it would prevent them from defining necessary generic impls for types like Option<T: PyClone>.

So, this will inevitably need to be a breaking change for downstream crates defining custom Element impls for their types. Luckily it extremely rare for crates to do this, and adding the PyClone impl is absolutely trivial in comparison to all the other changes needed to migrate away from the gil-refs API in the PyO3 0.21/0.22 updates anyways.

Patching `numpy` internals

As mentioned before, all of the numpy API's already require access to a Python token to prove that the GIL is held, so most of the necessary updates simply require replacing item.clone() calls with item.py_clone(py). The only exceptions are the calls to <[T] as ToOwned>::to_owned and ArrayView::to_owned (both of which require T: Clone), which now delegate to PyClone's provided methods discussed above. Delegating to these collection-specific methods is used instead of simply mapping the py_clone method to avoid losing out on the optimized to_owned implementations for Clone types.

Tasks

[x] Deprecate the gil-refs API
- [x] Add gil-refs feature guarding all usages of PyO3's gil-refs API (completed by @bschoenmaeckers)
- [x] Update doctests and examples to use the bound API instead of gil-refs
[x] Eliminate py-clone feature dependency
- [x] Add PyClone trait to replace Clone as a supertrait of Element
- [x] Add PyClone impls for all internal Element types
- [x] Update numpy internals to use PyClone instead of Clone
[ ] Documentation
- [x] Document the PyClone trait
- [ ] Update the Element docs to reference the newly required supertrait
- [ ] Create changelog and/or migration guide
[ ] Add tests -- the existing tests cover the changes pretty well, but there may be PyClone-specific tests worth adding.

Jul 14 '24 04:07 JRRudy1

@JRRudy1 is there still active work on this PR?

Jul 19 '24 23:07 EricLBuehler

@EricLBuehler all it really needs is a review and some minor updates to the docs, but I was hoping for some feedback from the devs before I bothered putting on the finishing touches in case they disagree with the approach and reject it outright. I'm not an official maintainer for the project so I can't really do much to move towards a release, so hopefully someone will get to it before too long...

Jul 20 '24 00:07 JRRudy1

Sounds good @JRRudy1! We're looking to use 0.22 in HF Tokenizers, so this would be super helpful.

@davidhewitt would you possibly be able to review this?

Jul 20 '24 01:07 EricLBuehler

I am not the usual rust-numpy maintainer at the moment; @adamreichold has been leading that, however I believe they have been busy recently.

So have I (new baby), so there's a few things blocked on me which I need to unstack first, however if we don't hear from @adamreichold within maybe a week then I will seek to get to reviewing this. 👍

Jul 20 '24 05:07 davidhewitt

So have I (new baby), so there's a few things blocked on me

Congratulations on the kid! Please take your time off, having a newborn is such an intense time where you should not add work stress on top. Take care of yourself and your family <3

Jul 20 '24 10:07 juntyr

So have I (new baby), so there's a few things blocked on me which I need to unstack first, however if we don't hear from @adamreichold within maybe a week then I will seek to get to reviewing this. 👍

Congratulations!

Jul 20 '24 10:07 EricLBuehler

Sorry for being off the radar for quite some time. I am trying to get back into the rotation now starting with rust-numpy and more specifically starting with the work to support NumPy 2 and then looking into the the two PR to bump pyo3 itself.

(If I understand things correctly, this one would supersede #431 and contains its changes? If so, could #431 be closed by its author @bschoenmaeckers if they agree?)

Jul 21 '24 12:07 adamreichold

Thanks for continuering on my work! You did a very nice job of removing the clone requirement, it is probably way better then I could do. I will close my MR and let’s continue on this MR.

If you may, please add Co-authored-by: Bas Schoenmaeckers <[email protected]> to the final merge commit. Thanks!

Jul 23 '24 01:07 bschoenmaeckers

Congratulations David! And no worries Adam, I haven't had any time for open-source work lately either. But I absolutely agree that @bschoenmaeckers deserves co-authorship on this :)

I just fixed the pyo3 version in the example crates that was causing the checks to fail, so they should hopefully pass now when @adamreichold gets a chance to approve another CI run.

Aug 01 '24 15:08 JRRudy1

I fixed the first check, but I don't quite understand the second... does anyone have a clue why cargo would be failing to find a compatible syn version for pyo3-macros-backend?

Aug 01 '24 23:08 JRRudy1

PyO3 updated MSRV to 1.63, so try updating CI to match here.

Aug 02 '24 08:08 davidhewitt

PyO3 updated MSRV to 1.63, so try updating CI to match here.

Thanks for the tip! I bumped the MSRV in Cargo.toml and don't see it referenced anywhere else (~~including the CI configs~~), but let me know if you are aware of anything else.

Edit: I finally found a rust version spec in ci.yml and updated it to the MSRV, hopefully it should work now...

Aug 02 '24 16:08 JRRudy1

Also @bschoenmaeckers I gave you write access to my fork in case you want to contribute any more changes (but no pressure to do so)

Aug 02 '24 17:08 JRRudy1

ndarray just released v0.16 - would it be easy to bump it in this PR to get the changes into v22 as well, or would it need a separate PR (if greater code changes are required)?

Aug 07 '24 07:08 juntyr

Would be nice to move this forward (ideally including the ndarray bump).

Aug 20 '24 10:08 djc

Hi @adamreichold @davidhewitt anything left to push this forward? thanks!

Aug 29 '24 23:08 sheldonrong

I think both myself and @adamreichold have been struggling for availability recently, sorry to all who have been waiting. I have some conflicting priority work to achieve in PyO3 (patch fix 0.22.3, upcoming Python 3.13 freethreaded support).

I am acutely aware that both rust-numpy and pyo3-asyncio need 0.22 releases, which is going to be important to unblock the ecosystem.

Here it looks like the strategy from @adamreichold seems to be to merge #429 first, and then circle back to this PR. It seems that the contributor to #429 hasn't replied to the there. So if someone else is willing to help update #429, by opening a PR which supersedes with an updated form as per @adamreichold's review comments from July 21st, I think that will be the next best step to move this forward.

I can potentially do this myself, although it may take me up to a few weeks depending on how much time I can find.

Aug 31 '24 13:08 davidhewitt

I think both myself and @adamreichold have been struggling for availability recently, sorry to all who have been waiting. I have some conflicting priority work to achieve in PyO3 (patch fix 0.22.3, upcoming Python 3.13 freethreaded support).

I am acutely aware that both rust-numpy and pyo3-asyncio need 0.22 releases, which is going to be important to unblock the ecosystem.

Thanks for the update. I'm sure I speak for others when I say we really appreciate your work on pyo3 and related libraries!

Here it looks like the strategy from @adamreichold seems to be to merge #429 first, and then circle back to this PR. It seems that the contributor to #429 hasn't replied to the there. So if someone else is willing to help update #429, by opening a PR which supersedes with an updated form as per @adamreichold's review comments from July 21st, I think that will be the next best step to move this forward.

I'll take a stab at this.

Aug 31 '24 15:08 maffoo

Hey guys, thx for the hard work. Any update on this?

Sep 29 '24 18:09 piotryordanov

Hey guys, thx for the hard work. Any update on this?

We're trying to get support for numpy2 merged first (#442), then we'll come back to this. We're having an issue with numpy2 support for 32-bit python builds on windows, once that is resolved it should be ready to go.

Sep 30 '24 15:09 maffoo

Hey @davidhewitt, thanks for taking a look! I can take care of this tonight/tomorrow depending on how smoothly the rebase goes.

One remaining question I have involves bumping the ndarray dependency to v0.16 as requested by @juntyr, which will likely require minimal, if any, code changes. Do you think that should happen in a separate PR, or could it be included in this one? I'd be inclined to include it and avoid the hoops of another PR, but maybe you would prefer to keep things more atomic.

Oct 04 '24 16:10 JRRudy1

The ndarray update just landed earlier in #439

Oct 04 '24 17:10 Icxolu

Awesome, thanks! We also just got the ndarray support done in #439 so a rebase hopefully solves that too!

Oct 04 '24 17:10 davidhewitt

(i.e. no ndarray changes should be needed in this PR)

Oct 04 '24 17:10 davidhewitt

I rebased the commits and made the requested changes. It all checks out locally but I'm sure CI will catch something; can you please approve the workflow when you get a chance?

Awesome, thanks! We also just got the ndarray support done in #439 so a rebase hopefully solves that too!

Oh perfect, I didn't see that PR!

Oct 05 '24 20:10 JRRudy1

Looks great, thanks! I also pushed a quick commit to revert some removals of deprecation messages (they will help nudge users off the GIL Refs). I thought that was more efficient than requesting the change and I had just a drop of time.

Let's see how CI goes now...

Oct 06 '24 06:10 davidhewitt

Thank you all for your work on this PR <3

Oct 06 '24 06:10 juntyr

CI all green! Coverage could be better but that's also taking a hit because of all the GIL Refs removal duplication. We can worry about that on the way to 0.23.

I'll merge now and try to find time to prep and release in the next couple days.

Thanks everyone who has helped here and with all the other upgrades for this 0.22 release!

Oct 06 '24 09:10 davidhewitt

Update pyo3 to 0.22.0 (without "py-clone")

Background

The problem with py-clone

Why not just add our own py-clone feature?

Could we implement Element for Bound<PyAny> instead?

Eliminating our dependence on py-clone

The new PyClone trait

Drawbacks

Patching numpy internals

Tasks

The problem with `py-clone`

Why not just add our own `py-clone` feature?

Could we implement `Element` for `Bound<PyAny>` instead?

Eliminating our dependence on `py-clone`

The new `PyClone` trait

Patching `numpy` internals