Ltryxcy

Results 2 issues of Ltryxcy

### Prerequisites
- [X] I have read the [ServerlessLLM documentation](https://serverlessllm.github.io/).
- [X] I have searched the [Issue Tracker](https://github.com/ServerlessLLM/ServerlessLLM/issues) to ensure this hasn't been reported before.

### System Information
OS: Ubuntu...

Bug
Priority 0

When I launch four model instances (Qwen3-8B, Llama-3.1-8B, Llama-3.2-3B, and Qwen3-4B) on two A100 GPUs, the system raises a CUDA out-of-memory error and one instance hits a memory management error. **The following commands...