mini-sglang
feat: Implement request cancellation
Description
This PR implements a complete request cancellation mechanism to prevent GPU resource waste when clients disconnect. It addresses the issue described in #15.
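As a rough illustration of the idea (not the code in this PR), the sketch below shows one common way to stop work when a client goes away: run generation as an `asyncio` task, race it against a disconnect signal, and cancel the task if the signal fires first. `ToyScheduler`, `serve_request`, and the `disconnected` event are hypothetical names made up for this example, and `asyncio.sleep` stands in for a GPU decode step.

```python
import asyncio


class ToyScheduler:
    """Hypothetical stand-in for an inference scheduler (not mini-sglang's API)."""

    def __init__(self):
        self.running = set()   # request ids currently holding resources
        self.steps_done = {}   # request id -> decode steps completed

    async def generate(self, request_id, max_tokens):
        self.running.add(request_id)
        self.steps_done[request_id] = 0
        try:
            for _ in range(max_tokens):
                await asyncio.sleep(0.01)  # assumption: one decode step
                self.steps_done[request_id] += 1
        finally:
            # Runs on normal completion and on cancellation alike,
            # so the request never stays registered after it stops.
            self.running.discard(request_id)


async def serve_request(scheduler, request_id, max_tokens, disconnected):
    """Race generation against a disconnect event; cancel on disconnect."""
    gen = asyncio.create_task(scheduler.generate(request_id, max_tokens))
    watch = asyncio.create_task(disconnected.wait())
    done, _ = await asyncio.wait({gen, watch}, return_when=asyncio.FIRST_COMPLETED)
    if gen in done:
        watch.cancel()  # generation finished first; stop watching
        return "finished"
    gen.cancel()        # client left first; stop generating
    try:
        await gen       # let the task run its cleanup
    except asyncio.CancelledError:
        pass
    return "cancelled"


async def demo():
    scheduler = ToyScheduler()
    disconnected = asyncio.Event()
    # Simulate the client dropping the connection shortly after the request starts.
    asyncio.get_running_loop().call_later(0.05, disconnected.set)
    status = await serve_request(scheduler, "req-1", 1000, disconnected)
    return status, scheduler.steps_done["req-1"], scheduler.running


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In a real server the disconnect signal would come from the HTTP layer and the cancellation would have to reach the scheduler process that owns the GPU, which this single-process sketch does not model.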
Verification
before
after
fix #15