Peng Jiang
**Describe the bug** Changing a resource label triggers re-deployment **Steps to reproduce** 1. Create a resource. 2. Modify the resource label, for example, change `stoppable` from true to false....
I'm trying vLLM/LMCache with the DeepSeek-R1-Distilled-Qwen14B model. With vLLM 0.9.2 and LMCache 0.3.5 in a non-CUDA environment, I got the following error during vLLM startup ``` ERROR 11-29 19:14:33 [core.py:589] EngineCore...
### GPUStack version v2.0.0 ### Operating System & CPU Architecture Ubuntu 22.04 ### GPU H200x8 ### ▶️ Steps to reproduce Running deepseek-v3.2 on 8xH200 sets TP=7 ``` 2025-12-02 13:33:57.757875+08:00 -...
### GPUStack version v2.0.0 ### Operating System & CPU Architecture Ubuntu 22.04 ### GPU Nvidia H200x8 ### ▶️ Steps to reproduce 1. In a standalone CPU GPUStack server in Tencent...
Currently, cross-worker distributed inference is fully automated and cannot be controlled by the user, which is inconvenient and may not provide the best performance in some cases. Some examples: 1....
### ❓ Is your enhancement related to a problem? Move the default Higress config from /opt/data to /etc/xxx to avoid the issues described in https://github.com/gpustack/gpustack/issues/3571 ### 💡 Describe the solution you'd like...
Avoid gateway port conflicts and show an error message in the gpustack container when there is a conflict
### ❓ Is your enhancement related to a problem? Currently, the built-in Higress gateway requires multiple ports, but users are not aware of it. When there is a port conflict,...
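
As an illustration of the kind of pre-flight check this request is asking for, here is a minimal sketch in Python. It is not GPUStack's or Higress's actual implementation: the function name `find_port_conflicts` is hypothetical, and the ports the gateway really needs are not listed in the issue, so the caller has to supply them.

```python
import socket


def find_port_conflicts(ports, host="0.0.0.0"):
    """Return the ports from `ports` that cannot be bound on `host`.

    Hypothetical helper: a port that fails to bind is assumed to be
    in use by another process (or otherwise unavailable).
    """
    conflicts = []
    for port in ports:
        # Open a fresh TCP socket per port and close it right after the check.
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            try:
                sock.bind((host, port))
            except OSError:
                conflicts.append(port)
    return conflicts
```

A startup script could call this with the gateway's required ports and exit with a readable message such as "port N is already in use" when the returned list is not empty. The check is best-effort: a port can still be taken between the check and the moment the gateway actually binds it.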