Fix maximum recursion depth triggered on exception exit
Fix https://github.com/sgl-project/sglang/issues/3518
Motivation
Modifications
Checklist
- [X] Format your code according to the Code Formatting with Pre-Commit.
- [ ] Add unit tests as outlined in the Running Unit Tests.
- [ ] Update documentation / docstrings / example tutorials as needed, according to Writing Documentation.
- [ ] Provide throughput / latency benchmark results and accuracy evaluation results as needed, according to Benchmark and Profiling and Accuracy Results.
- [ ] For reviewers: If you haven't made any contributions to this PR and are only assisting with merging the main branch, please remove yourself as a co-author when merging the PR.
Hey, I don't think this fixes the error:
Your current patch only addresses scenarios where SIGKILL might not be effective by adding an additional SIGQUIT attempt. It does not prevent psutil from entering a deep recursion when invoked in a signal handler to inspect the current process’s children.
To fully resolve the RecursionError, it would be better to refactor the signal-handling approach (avoiding complex psutil calls inside the signal handler) and/or change how child processes are gathered and killed.
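A minimal sketch of that idea, not the actual sglang code: the handler only records that a shutdown was requested, and the cleanup (psutil calls included) runs later from normal code, guarded so it executes at most once. All names here (`shutdown_requested`, `request_shutdown`, `cleanup_once`, `install_handlers`) are hypothetical.

```python
import signal
import threading

# Hypothetical names for illustration; none of these come from sglang.
shutdown_requested = threading.Event()
_cleanup_done = False


def request_shutdown(signum, frame):
    # Keep the handler trivial: record the request and return immediately.
    # No psutil calls and no process inspection happen here.
    shutdown_requested.set()


def cleanup_once(kill_children):
    """Run kill_children() at most once, outside any signal handler."""
    global _cleanup_done
    if _cleanup_done:
        return False
    _cleanup_done = True
    kill_children()
    return True


def install_handlers():
    # signal.signal() must be called from the main thread.
    signal.signal(signal.SIGTERM, request_shutdown)
```

The main loop would then check `shutdown_requested.is_set()` and call `cleanup_once(...)` with whatever function actually terminates the children, so a second signal arriving during cleanup cannot re-enter it.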
Thanks, I will review it later and modify it.
@kebe7jun Have you done it?
I tried reproducing it locally, but without success. Can you provide some pointers? This patch should solve the problem.
@kebe7jun I will merge it, thanks.
@zhaochenyang20 Hi, can this PR be merged?
@kebe7jun If the CI goes well, I will merge it later.
@zhaochenyang20 CI goes well.
@kebe7jun merged! Thanks!
Which version works now, please?
The latest commit on main should be fine. We haven't released a new version with this fix yet.