feat: Switch to `std::process::Command` when invoking the program binary
The use of command-fds with tokio::process::Command causes rare, deterministic race conditions that lead to witness generation failures. These issues consistently occur for specific block ranges on OP Sepolia, Base, and OP Mainnet in op-succinct.
We should switch to std::process::Command to avoid these race conditions. This will increase resource usage for concurrent witness generation, and we can move back to tokio once the race condition issues are resolved.
I'm not exactly certain the source of the issue, but the Safety section of tokio::process::Command mentions some issues arising from file descriptors: https://tikv.github.io/doc/tokio/process/struct.Command.html#safety
Codecov Report
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 93.8%. Comparing base (
38cf22e) to head (4be3cb4).
Additional details and impacted files
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
It's a bit concerning that we have to move to a blocking task to fix this. I'd prefer if we found out the root cause of the issue before making it such that the future that spawns this command cannot be cancelled. That can cause hanging behavior in ways we may not anticipate, if the host program panics.
I think a better approach here, at least for the native run, is to move towards just executing the client program from within the host. It'll require a bit of a larger refactor (likely including some or all of #398 + #361 + #553), but would remove the child process.
My primary suspect here is #553 - that's where I'd start. fwiw, we have not yet ran into this in the upstream kona-client + kona-host pair either.