Expand performance section of the guide
- [ ] Overhead of conversions (i.e. `PyList` vs `Vec` etc.)
- [ ] String `intern!`
- [ ] `Vec<u8>` becoming `list[int]` (and thus being very slow), usage of `Cow<[u8]>` as an alternative
- [ ] `#[pyo3(get)]` deep-cloning non-`Py` data
- [ ] dictionary dispatch, i.e. look-up based on type object identity
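To make the last item concrete, here is a minimal pure-Rust sketch of dispatch keyed on type identity. It is only an analogy: it uses `std::any::TypeId` as the key, whereas in a PyO3 extension the key would be the Python type object. The names `Handler`, `build_table`, `dispatch`, `describe_int` and `describe_text` are made up for illustration and are not PyO3 APIs.

```rust
use std::any::{Any, TypeId};
use std::collections::HashMap;

/// A handler receives the type-erased value and returns a description.
type Handler = fn(&dyn Any) -> String;

fn describe_int(value: &dyn Any) -> String {
    // The table only routes i64 values here, so a failed downcast would be a bug.
    let v = value.downcast_ref::<i64>().expect("registered under i64");
    format!("int({v})")
}

fn describe_text(value: &dyn Any) -> String {
    let v = value.downcast_ref::<String>().expect("registered under String");
    format!("text({v})")
}

/// Build the table once; every later dispatch is a single hash look-up
/// instead of a chain of "is it this type? is it that type?" checks.
fn build_table() -> HashMap<TypeId, Handler> {
    let mut table: HashMap<TypeId, Handler> = HashMap::new();
    table.insert(TypeId::of::<i64>(), describe_int);
    table.insert(TypeId::of::<String>(), describe_text);
    table
}

/// Look up the handler by the concrete type of `value`; unknown types yield `None`.
fn dispatch(table: &HashMap<TypeId, Handler>, value: &dyn Any) -> Option<String> {
    let key = Any::type_id(value);
    table.get(&key).map(|handler| handler(value))
}
```

The point of the pattern is that the cost of choosing a handler stays roughly constant as more types are registered, while a sequence of failed type checks grows with the number of candidates.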
A couple of extra ideas:
- Link from performance section to parallelism section (just to help visibility)
- `.extract::<Foo>()` cloning vs. `.extract::<PyRef<Foo>>()` taking a reference (for `#[pyclass]`)
- Error creation, e.g. `PyErrArguments`, how to avoid lazy error construction etc.
Also, `.extract()` is massively slower when it fails than `.downcast()`.
@samuelcolvin we added that as the initial content of the performance section already 😉
https://pyo3.rs/main/performance#extract-versus-downcast
> `Vec<u8>` becoming `list[int]` (and thus being very slow), usage of `Cow<[u8]>` as an alternative
I am a bit confused about the following function/method:
#[pyfunction]
fn foo(a: Vec<u8>) -> Vec<u8> {
    a
}
def foo(a: bytes) -> bytes: ...
- Its return value is indeed `bytes`, so there is no problem as mentioned above, right?
- But for the parameter `a`, this is its implementation, so it is indeed implemented in the following way, right?
def convert(a: bytes) -> list[int]:
    return list(a)
If my understanding is correct, then I think the documentation is misleading:
| Python | Rust | Rust (Python-native) |
|---|---|---|
| `bytes` | `Vec<u8>`, `&[u8]`, `Cow<[u8]>` | `PyBytes` |
It makes users think that `bytes` can be converted to `Vec<u8>` directly (as bytes), but it actually goes through `list[int]`.
Especially since Vec<u8> can be correctly converted to bytes when used as a return value, this exacerbates the misunderstanding.
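The cost difference being discussed can be sketched in plain Rust, without PyO3. This is a model under stated assumptions, not PyO3's implementation: `vec_from_int_items` stands in for the generic sequence path (each item is a separate integer that must be checked individually), and `vec_from_byte_buffer` stands in for a path that knows the input is a contiguous byte buffer. Both function names are hypothetical.

```rust
use std::convert::TryFrom;

/// Models the generic sequence path: every item is a separate integer
/// that has to be range-checked before it can become a `u8`.
fn vec_from_int_items(items: &[i64]) -> Result<Vec<u8>, String> {
    let mut out = Vec::with_capacity(items.len());
    for &item in items {
        let byte = u8::try_from(item)
            .map_err(|_| format!("{item} does not fit in a u8"))?;
        out.push(byte);
    }
    Ok(out)
}

/// Models the byte-buffer path: the data is already contiguous `u8`,
/// so a single bulk copy is enough and no per-item check is needed.
fn vec_from_byte_buffer(buffer: &[u8]) -> Vec<u8> {
    buffer.to_vec()
}
```

In the real extension the per-item path is more expensive than this model suggests, because each item is a Python integer object rather than a plain `i64`.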
> Its return value is indeed `bytes`, so there is no problem as mentioned above, right?
Not anymore. We fixed this long-standing footgun with the introduction of `IntoPyObject`.
> But for the parameter `a`, this is its implementation, so it is indeed implemented in the following way, right?
Yes, `FromPyObject` still has the problem of treating every `Vec<T>` equally. I looked into a fix for this as part of the ongoing `FromPyObject` rework https://github.com/PyO3/pyo3/pull/4390#issuecomment-2480501756, but the `IntoPyObject` approach is currently not compatible with the direction we planned for `FromPyObject`. So for the time being this will probably stay as is. PRs improving the documentation around this are of course welcome 😄
If I understand correctly, #5244 (v0.27) resolves this issue 🎉.
Yes, we found a way to do a similar specialization for FromPyObject as well.
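For readers wondering how a per-element-type special case can be expressed on stable Rust at all, one common pattern is to give the element trait a hook with a default body that specific element types override. The sketch below shows only that general pattern; this thread does not describe how #5244 actually does it, and `Obj`, `Element` and `extract_vec` are hypothetical stand-ins rather than PyO3 items.

```rust
use std::convert::TryFrom;

/// Stand-in for a Python object (hypothetical, for illustration only).
enum Obj {
    Bytes(Vec<u8>),
    List(Vec<i64>),
}

trait Element: Sized {
    /// Convert one item.
    fn from_int(value: i64) -> Option<Self>;

    /// Hook used by the container conversion. The default goes item by item,
    /// whatever the input is.
    fn extract_vec(obj: &Obj) -> Option<Vec<Self>> {
        match obj {
            Obj::List(items) => items.iter().map(|&v| Self::from_int(v)).collect(),
            Obj::Bytes(data) => data.iter().map(|&b| Self::from_int(i64::from(b))).collect(),
        }
    }
}

impl Element for i64 {
    fn from_int(value: i64) -> Option<Self> {
        Some(value)
    }
}

impl Element for u8 {
    fn from_int(value: i64) -> Option<Self> {
        u8::try_from(value).ok()
    }

    /// Override: a byte buffer can be copied in one step.
    fn extract_vec(obj: &Obj) -> Option<Vec<Self>> {
        match obj {
            Obj::Bytes(data) => Some(data.clone()),
            Obj::List(items) => items.iter().map(|&v| Self::from_int(v)).collect(),
        }
    }
}

/// The container conversion is written once and defers to the element's hook.
fn extract_vec<T: Element>(obj: &Obj) -> Option<Vec<T>> {
    T::extract_vec(obj)
}
```

The container code stays generic over `T`, and only the `u8` implementation knows about the fast path, which is what makes the behaviour differ by element type without language-level specialization.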