Expand performance section of the guide

Open adamreichold opened this issue 2 years ago • 6 comments

  • [ ] Overhead of conversions (i.e. PyList vs Vec etc.)
  • [ ] String intern!
  • [ ] Vec<u8> becoming list[int] (and thus being very slow), usage of Cow<[u8]> as an alternative
  • [ ] #[pyo3(get)] deep-cloning non-Py data
  • [ ] dictionary dispatch, i.e. look-up based on type object identity
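The last item can be illustrated at the Python level. The sketch below is only an analogue and not PyO3 code: the handler names (`handle_int`, `handle_str`, `handle_fallback`) and `dispatch` are hypothetical, and a PyO3 implementation would presumably key a Rust map on the type object's pointer instead. It replaces a chain of isinstance checks with a single dictionary look-up on the exact type.

```python
# Hypothetical handlers, for illustration only.
def handle_int(value):
    return f"int:{value}"


def handle_str(value):
    return f"str:{value}"


def handle_fallback(value):
    return f"other:{value!r}"


# Type objects compare and hash by identity by default,
# so each look-up is a single dictionary probe.
HANDLERS = {int: handle_int, str: handle_str}


def dispatch(value):
    # type(value) is the exact type; subclasses (e.g. bool) are not
    # matched and end up in the fallback handler.
    handler = HANDLERS.get(type(value), handle_fallback)
    return handler(value)
```

Because the look-up uses the exact type, subclasses miss the table: `dispatch(True)` takes the fallback since `bool` is not a key. A slower isinstance-based path is therefore usually kept as the fallback.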

adamreichold • Jul 11 '23 19:07

A couple of extra ideas:

  • Link from performance section to parallelism section (just to help visibility)
  • .extract::<Foo>() cloning vs .extract::<PyRef<Foo>>() taking a reference (for #[pyclass])

davidhewitt • Jul 11 '23 20:07

Error creation, e.g. PyErrArguments, how to avoid lazy error construction, etc.

davidhewitt • Jul 17 '23 21:07

Also, .extract() is massively slower than .downcast() when it fails.

samuelcolvin • Jul 17 '23 21:07

@samuelcolvin we added that as the initial content of the performance section already 😉

https://pyo3.rs/main/performance#extract-versus-downcast

davidhewitt • Jul 17 '23 21:07

> Vec<u8> becoming list[int] (and thus being very slow), usage of Cow<[u8]> as an alternative

I am a bit confused about the following function/method:

#[pyfunction]
fn foo(a: Vec<u8>) -> Vec<u8> {
    a
}

and its Python type stub:

def foo(a: bytes) -> bytes: ...

For the parameter a, though, the conversion is effectively implemented as follows, right?

def convert(a: bytes) -> list[int]:
    return list(a)

If my understanding is correct, then I think the documentation is misleading:

| Python | Rust | Rust (Python-native) |
| --- | --- | --- |
| bytes | Vec<u8>, &[u8], Cow<[u8]> | PyBytes |

This makes users think that bytes is converted to Vec<u8> directly (as bytes), when the argument is actually treated as a list[int].

The misunderstanding is made worse by the fact that Vec<u8> is converted to bytes correctly when used as a return value.
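The cost difference behind this can be sketched in pure Python, without PyO3. The example assumes only the standard library; `as_int_list` is a hypothetical stand-in for an element-by-element conversion and is not PyO3's actual code path. It shows that a list[int] view of the same data needs a pointer-sized slot per element, while bytes stores one byte per element.

```python
import sys


def as_int_list(data: bytes) -> list[int]:
    # Element-by-element conversion: every byte becomes a separate
    # int object reference stored in the list.
    return [b for b in data]


payload = bytes(range(256)) * 4  # 1024 bytes of sample data
ints = as_int_list(payload)

# Both representations hold the same values ...
round_trip = bytes(ints)

# ... but the list container alone is several times larger than the
# bytes object, before counting the per-element conversion work.
bytes_size = sys.getsizeof(payload)
list_size = sys.getsizeof(ints)
print(f"bytes: {bytes_size} B, list[int]: {list_size} B")
```

The exact sizes depend on the interpreter and platform, so none are quoted here. The relevant point is that the list grows by a pointer per element, while bytes grows by a single byte.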

WSH032 • Feb 21 '25 09:02

Not anymore. We fixed this long-standing footgun with the introduction of IntoPyObject.

> For the parameter a, though, the conversion is effectively implemented as follows, right?

Yes, FromPyObject still has the problem of treating every Vec<T> equally. I looked into a fix for this as part of the ongoing FromPyObject rework (https://github.com/PyO3/pyo3/pull/4390#issuecomment-2480501756), but the IntoPyObject approach is currently not compatible with the direction we planned for FromPyObject. So for the time being this will probably stay as is. PRs improving the documentation around this are of course welcome 😄

Icxolu • Feb 23 '25 13:02

> Yes, FromPyObject still has the problem of treating every Vec<T> equally. I looked into a fix for this as part of the ongoing FromPyObject rework (#4390 (comment)), but the IntoPyObject approach is currently not compatible with the direction we planned for FromPyObject. So for the time being this will probably stay as is. PRs improving the documentation around this are of course welcome 😄

If I understand correctly, #5244 (v0.27) resolves this issue 🎉.

WSH032 • Oct 29 '25 12:10

> If I understand correctly, #5244 (v0.27) resolves this issue 🎉.

Yes, we found a way to do a similar specialization for FromPyObject as well.

Icxolu • Oct 29 '25 16:10