pyo3 icon indicating copy to clipboard operation
pyo3 copied to clipboard

feat: add #[pyo3(allow_threads)] to release the GIL in (async) functions

Open wyfo opened this issue 7 months ago • 16 comments

Relates to #1632

Draft based on #3609

wyfo avatar Nov 30 '23 02:11 wyfo

CodSpeed Performance Report

Merging #3610 will not alter performance

Comparing wyfo:allow_threads (08948df) with main (6c2e6f8)

Summary

✅ 69 untouched benchmarks

codspeed-hq[bot] avatar Nov 30 '23 03:11 codspeed-hq[bot]

@davidhewitt Here is the next one! I'm more explicit now :smile:

wyfo avatar Dec 07 '23 08:12 wyfo

@davidhewitt I have a few minutes now and the merge queue failed. If we could hold off on merging I could have a look as well?

adamreichold avatar Dec 09 '23 08:12 adamreichold

Sure thing, I'll leave this for you to review and merge 👍

davidhewitt avatar Dec 09 '23 08:12 davidhewitt

The issue with the merge queue was surely the async fn test added in gil.rs. I've annotated it with #[cfg(not(target_arch = "wasm32"))]

wyfo avatar Dec 09 '23 09:12 wyfo

One thing I only thought about now: How does this interact with async methods taking &self/&mut self. That should also be incompatible with allow_threads, shouldn't it?

adamreichold avatar Dec 09 '23 15:12 adamreichold

One thing I only thought about now: How does this interact with async methods taking &self/&mut self. That should also be incompatible with allow_threads, shouldn't it?

To add insult to injury, we should also consider how this would interact with #3646.

adamreichold avatar Dec 14 '23 12:12 adamreichold

One thing I only thought about now: How does this interact with async methods taking &self/&mut self. That should also be incompatible with allow_threads, shouldn't it?

I think this bit is ok, because the borrow on the PyCell still defends against concurrent modification. Something like this is already ok:

    let x: PyRef<Foo> = ...;
    let x_borrow: &Foo = &*x;
    py.allow_threads(|| x_borrow.do_something()); // as long as x is `Sync`, I think `x_borrow` is `Send`

(My code might be off but hopefully the point is communicated.)

To add insult to injury, we should also consider how this would interact with https://github.com/PyO3/pyo3/pull/3646.

I wonder if this would imply that the future needs to be polled on the worker thread? Hmm.

davidhewitt avatar Dec 14 '23 13:12 davidhewitt

(My code might be off but hopefully the point is communicated.)

Ah, that makes sense! The interior of the pyclass is not bound to the GIL, just getting access to it is. (Similar to how getting an ArrayView out of a NumPy array requires the GIL, but the view can then be process within allow_threads.)

I wonder if this would imply that the future needs to be polled on the worker thread? Hmm.

I think this would be the default outcome if both MR were merged as is. This should be fine, i.e. the Send bound already exists and hence allow_threads can soundly do this. I mainly wonder whether performance is show stopper for these short-lived closure which only want to poll the future once and then move on. Maybe this would require some deeper integration like having the future always live on the worker thread and just send a request to poll it or something like that.

adamreichold avatar Dec 14 '23 13:12 adamreichold

As said in #1632, i'm back! @adamreichold @davidhewitt

wyfo avatar Apr 06 '24 19:04 wyfo

I don't know what I can do for the nightly pipeline error 🤷‍♂️

error: non-local `impl` definition, they should be avoided as they go against expectation

wyfo avatar Apr 06 '24 19:04 wyfo

I don't know what I can do for the nightly pipeline error 🤷‍♂️

error: non-local `impl` definition, they should be avoided as they go against expectation

Feel free to allow the lint for now. You'll have to chase down where our code triggers this and apply #[allow(non_local_definitions)] (and a TODO) there.

It's a new nightly lint, it's not something you did. We have done work earlier to adapt our code to this lint, but this part is new. You can fix it if you want but the last time it was a substantial amount of work so it should be fixed in its own PR.

mejrs avatar Apr 06 '24 22:04 mejrs

You'll have to chase down where our code triggers this

That's the issue, it's not my code which is triggering the lint. My changes have nothing to do with the following error

error: non-local `impl` definition, they should be avoided as they go against expectation
   --> tests/test_proto_methods.rs:643:1
    |
643 | #[pymethods]
    | ^^^^^^^^^^^^
    |
    = help: move this `impl` block outside the of the current function `trampoline` and up 3 bodies

It's an issue of generated code (parts I haven't touched). My explaination is that a bug has been very recently introduced in the nightly build. By the way #4052 is failing too for the same reasons, while it's just a typo correction. There is definitely an issue with the nightly CI.

wyfo avatar Apr 07 '24 01:04 wyfo

You can also wait for someone to send a PR (probably me tomorrow) and rebase after it is merged.

mejrs avatar Apr 07 '24 16:04 mejrs

#4033 broke some code here, I will update the PR. Could I ask then for a new review? @adamreichold? I would like to unblock the async PRs to close the async chapter for the next release.

wyfo avatar Apr 18 '24 19:04 wyfo