update `extract_argument` to use Bound APIs
Builds upon #3681 to move our internal argument extraction mechanisms off the GIL Pool.
Only the last commit is new, and I don't think this is worth reviewing right until #3681 is in, after which I'll rebase and tidy this a bit further. Pushing now purely because I think this should hopefully demonstrate a tantalising performance speedup which we're so close to merging! 😂
CodSpeed Performance Report
Merging #3708 will improve performances by 13.02%
Comparing davidhewitt:extract-arg2 (101563c) with main (a15e4b1)
Summary
⚡ 2 improvements
✅ 65 untouched benchmarks
Benchmarks breakdown
| Benchmark | main |
davidhewitt:extract-arg2 |
Change | |
|---|---|---|---|---|
| ⚡ | test_simple_args_kwargs_rs |
35.3 µs | 31.9 µs | +10.64% |
| ⚡ | test_simple_kwargs_rs |
35 µs | 31 µs | +13.02% |
Hmm I'll investigate later why this hasn't yielded the speedup I've seen in #3606, there's likely some other part I need to merge first...
Ah, I think it's PyDict::new_bound which is missing, I'll open a PR for that once we've merged the feature proposed in #3707
The performance regressions with the ordered_X benchmarks are genuine and look like they're caused because part of the proc macros go through &PyCell<T> GIL Ref machinery, and the compatibility implementation of FromPyObject::extract_bound is hurting this.
Probably the right approach is to get this PR merged (well, needs #3784 reviewed and merged first) and also #3792. Then I think there will be enough pieces in place to deal with updating the proc macro internals in a follow-up.
This is getting closer to being reviewable, but I think first it was worth splitting out #3801 and #3802.
Thanks for the reviews! I've removed the PyArg newtype following #3858, which has made this implementation a fair bit nicer on the macro side. I'll try to find time to tidy the rest up later.
Thanks both, I'll aim to tidy this up tomorrow 👍
Thanks both, think we're finally there on this one!
Hmm I'll investigate later why this hasn't yielded the speedup I've seen in #3606, there's likely some other part I need to merge first...
What is the latest result btw?
https://github.com/PyO3/pyo3/pull/3708#issuecomment-1871521345, but I think it's quite likely that parts of the speedup already got merged now, given how many times this got split up.
I will probably analyze again some time and see if there's any further tricks to apply here, but for now I think this is good.