pyo3
pyo3 copied to clipboard
Support `Py_EnterRecursiveCall` and `Py_LeaveRecursiveCall`?
I'm struggling with some recursive code in pydantic-core, particularly related to wasm and pyodide, see https://github.com/samuelcolvin/pydantic-core/pull/167.
It was suggested I use Py_EnterRecursiveCall, is there any case where this would help? Is it something pyo3 could help/support?
I had assumed recursive code in rust avoided the stack and python's recursion limit even when working with python objects, but messing with setrecursionlimit is effecting when tests pass and fail in wasm.
To be honest, I'm out of my depth here, any guidance here would be appreciated. Thanks in advance.
recursive code in rust avoided the stack
Recursive Rust code definitely uses the stack - it's where all function locals and return addresses etc reside. e.g. here's the simplest stack overflow you could possibly hit: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=f47a031011c7aff7d62a65519ca66700
Not sure if this is helpful, if you really need to have unlimited recursion the way I tend to do it is transform recursive function calls into loops.
e.g.
enum RecursiveData {
Many(Vec<RecursiveData>),
One(String),
}
fn recursive_flatten(data: RecursiveData) -> Vec<String> {
match data {
RecursiveData::Many(datas) => datas.into_iter().map(recursive_flatten).flatten().collect(),
RecursiveData::One(string) => vec![string],
}
}
can become this loop version which is bounded only by the memory of the machine:
enum RecursiveData {
Many(Vec<RecursiveData>),
One(String),
}
fn loop_flatten(data: RecursiveData) -> Vec<String> {
match data {
RecursiveData::Many(datas) => {
let mut paused = Vec::new();
let mut out = Vec::new();
let mut current = datas.into_iter();
loop {
// Process next value from deepest iterator
while let Some(next) = current.next() {
match next {
RecursiveData::Many(next_datas) => {
// Save current for later, descend into new level
paused.push(std::mem::replace(&mut current, next_datas.into_iter()));
},
RecursiveData::One(string) => out.push(string),
}
}
// Current is exhausted, try resuming
if let Some(prev) = paused.pop() {
current = prev;
} else {
// All done
return out;
}
}
},
RecursiveData::One(string) => vec![string],
}
}
Whether it's possible to do such a transformation will depend on the complexity of your recursive algo :)
Recursive Rust code definitely uses the stack
Sorry, I wasn't very clear.
What I meant was
I had assumed recursive code in rust avoided the python call stack
In other words, you could call recursive code in rust without the significant performance penalty applicable in python because rust doesn't need to do the hard work of maintaining a call stack.
I was fairly confident about that in pure rust, my question was about recursive rust code which accessed a python object, e.g. a nested dict or list.
I think an example is in order:
Let's say we have the following rust code which counts the items in a list, and any sublists:
#[pyfunction]
fn count(x: &PyAny, depth: usize) -> PyResult<i64> {
if depth > 100 {
Ok(0)
} else if let Ok(list) = x.cast_as::<PyList>() {
let mut sum: i64 = 0;
for item in list.iter() {
sum += count(item, depth + 1)?;
}
Ok(sum)
} else {
x.extract::<i64>()
}
}
We can call it thus:
from ... import count
print(count([1, 2, 3], 0))
#> 6
print(count([1, 2, [4, 5]], 0))
#> 12
x = [1]
x.append(x)
print(count(x, 0))
#> 100
My questions is: how does count interact with the python call stack?
On the one hand, if I remove the if depth > 100 { check, and call this function, it seg faults, I never get a python RecursionError.
On the other hand, similar code in pydantic-core can pass or fail based on changing sys.setrecursionlimit(...) with both pypy and emscripten/wasm.
Hence my confusion.
Secondly, is the above code a good place to use Py_EnterRecursiveCall?
the way I tend to do it is transform recursive function calls into loops.
I'm afraid that's really not possible in pydantic-core it's inherently a pile of nested validators which call each other recursively.
When a recursive model needs to be validated, e.g.
class Branch:
name: str
sub_branches: 'list[Branch]'
It's hard to imagine a way to validate this (in a general way, with no knowledge of the schema at compile time) without recursion.
Right, sorry for my misunderstanding above.
So when you call the count(item, depth + 1) from Rust, you are indeed calling the original Rust function and not some wrapper which modifies the Python call stack.
So yes, to avoid infinite recursion causing a stack overflow a depth check is reasonable. It would probably make sense to use Py_EnterRecursiveCall (and Py_LeaveRecursiveCall) because it'll allow users to configure the depth with sys.setrecursionlimit?
Why you're having different experience with PyPy / wasm is a surprise to me, I had a quick peek at the Python implementation of the recursion limit and it looks like it's essentially just a counter on all platforms.
Circling back here, with Py_EnterRecursiveCall and Py_LeaveRecursiveCall now exposed as ffi symbols, I'm going to close this issue as I don't think we're planning to add support for a "safe" API for this for now. If anyone does have a need for this, please comment and we can discuss a design.
Circling back here, with Py_EnterRecursiveCall and Py_LeaveRecursiveCall now exposed as ffi symbols, I'm going to close this issue as I don't think we're planning to add support for a "safe" API for this for now. If anyone does have a need for this, please comment and we can discuss a design.
Makes sense, thanks.