fix: add stacker and maybe_grow on recursion guard
Changed
under #[cfg(feature="std")], stacker::maybe_grow was called in RecursionCounter::try_decrease to prevent stack overflow
I also tested on Windows(stack size 1MB), it seems fine.
reproducible code from original issue
use datafusion::prelude::{ParquetReadOptions, SessionContext};
#[tokio::main(flavor = "current_thread")]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
let ctx = SessionContext::new();
ctx.register_parquet(
"parquet_table",
"batches.parquet",
ParquetReadOptions::default(),
)
.await?;
let sql_query = "
SELECT seq
FROM (
SELECT seq,
LEAD(seq) OVER (ORDER BY seq) AS next_seq
FROM parquet_table
) AS subquery
WHERE next_seq - seq > 1";
let df = ctx.sql(sql_query).await?;
df.show().await?;
Ok(())
}
Missing
- [ ] Return error when growing stack fail: growing stack on unsupported target seems to be no-op on stacker side, so I am unable to detect it.
- [ ] Test: which part should be tested?
- [ ] Observe performance impact
I evaluated stacker as part of sqlparser and I thought it was doing some too crazy stuff that made it hard to use in embedded / wasm environments. Maybe that is better now
Originally posted by @alamb in https://github.com/apache/datafusion/issues/9375#issuecomment-2197393452
I think the use of stacker here is fine under #[cfg(feature="std")], and I totally forget to add stacker as optional dependency :smile: .
I have add stacker as optional dependency, so it won't be used in no std environment.
*It will continue to stack overflow in this case.
What's your opinion on the use of stacker? @alamb
I have add
stackeras optional dependency, so it won't be used in no std environment.*It will continue to stack overflow in this case.
What's your opinion on the use of stacker? @alamb
I think stacker is a clever hack but we should fix the code for real to avoid deeply nested callstacks rather than hoping that a third-party library will bail us out.
I don't think it is a good idea to use stacker as I don't want to be responsible for trying to debug any issues caused by its use. That may sound selfish, but I am already at my cognitive complexity load for sqlparser (and various other crates)
That's okay. (I think this PR as a draft to demonstrate what I would like to change
Then I would try to reduce stack usage with another PR.
Can I mark this PR as draft again or just create a new one?
@Eason0729 thanks for looking to improve this behavior! I think should be good to create a new PR (we can link also this PR in the description since they're related)
Marking this as draft in the meantime