Zaire
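Dark mode has a few problems; I'm not sure whether I'm the only one seeing them. File names are hard to read, and so are the right-click menu colors in dark mode.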
### Description
This PR implements a complete request cancellation mechanism to prevent GPU resource waste when clients disconnect. It addresses the issue described in #[15].

### Verification
before after fix...
### Describe the bug
When a client disconnects (e.g., Ctrl+C via curl), the backend (Scheduler/GPU) continues to generate tokens until `max_seq_len` is reached. This wastes GPU resources.

### Reproduction
Use...
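
To make the disconnect scenario concrete, here is a minimal sketch of cooperative cancellation in a token loop. It is not the project's actual code: `generate`, `run`, `MAX_SEQ_LEN`, and the `asyncio.Event` flag are hypothetical names chosen for illustration, and `asyncio.sleep(0)` stands in for one decode step on the GPU. The idea is only that the loop checks a cancellation flag before each step instead of always running to the length cap.

```python
import asyncio
from typing import List, Optional

MAX_SEQ_LEN = 50  # hypothetical cap, standing in for max_seq_len


async def generate(cancelled: asyncio.Event, produced: List[int]) -> None:
    """Hypothetical decode loop: one token per step until the cap or a cancel."""
    for step in range(MAX_SEQ_LEN):
        if cancelled.is_set():
            # The client is gone: stop before spending another decode step.
            return
        await asyncio.sleep(0)  # stand-in for one decode step
        produced.append(step)


async def run(disconnect_after: Optional[int]) -> int:
    """Run one request; optionally simulate a client disconnect.

    Returns the number of tokens produced before the loop stopped.
    """
    cancelled = asyncio.Event()
    produced: List[int] = []
    task = asyncio.create_task(generate(cancelled, produced))
    if disconnect_after is not None:
        # Wait until some tokens exist, then signal the disconnect.
        while len(produced) < disconnect_after:
            await asyncio.sleep(0)
        cancelled.set()
    await task
    return len(produced)


if __name__ == "__main__":
    print("no disconnect:", asyncio.run(run(None)))
    print("disconnect after 5 tokens:", asyncio.run(run(5)))
```

Without the flag check, the loop always runs all `MAX_SEQ_LEN` steps, which mirrors the reported behavior. With it, the loop stops within a step or so of the disconnect signal; the exact token count at the stop depends on task scheduling.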