Peng Jiang

Results: 24 comments of Peng Jiang

If the error is "input is too large to process", it may be related to this issue https://github.com/gpustack/gpustack/issues/950. It's not a problem on the GPUStack side, but we plan to...

May be related to https://github.com/vllm-project/vllm/issues/28184

Should be considered together with https://github.com/gpustack/gpustack/issues/3293 https://github.com/gpustack/gpustack/issues/3525

Refer to https://github.com/gpustack/gpustack/issues/814

We use Higress as an embedded AI gateway. Besides Higress, there are many other components and features in V2.

Tried with 0.3.9post2, which should be compatible with vLLM v0.9.2 when installed with --no-build-isolation, according to the compatibility matrix.
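The general pattern for the comment above can be sketched as follows. The package name is a placeholder (the original comment does not name it); the real point is the pip flag, which makes the build compile against the already-installed vLLM/torch instead of an isolated build environment:

```shell
# Hypothetical: install a vLLM-adjacent extension at a pinned version without
# build isolation, so its native build sees the vLLM v0.9.2 / torch already
# present in the environment. "<package>" is a placeholder, not from the source.
pip install --no-build-isolation "<package>==0.3.9.post2"
```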

It's a Hygon GPU environment; the framework is DTK, which is compatible with ROCm.

You can add a customized vLLM backend here. The error "The model contains custom code that must be executed to load correctly. If you trust the source, please pass the backend...
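That error is the standard "custom model code" guard. A minimal sketch of the fix at the vLLM level, assuming the model repository's code is trusted (the model name is an example, not from the source; in GPUStack the flag would be passed through as an extra backend parameter):

```shell
# Sketch: allow vLLM to execute the custom modeling code shipped with the
# model repository. --trust-remote-code is a real vLLM CLI flag; the model
# identifier below is only an example.
vllm serve some-org/some-custom-model --trust-remote-code
```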

1. For Docker deployment, please ensure you mapped the directory to the GPUStack container with the correct path.
2. If there is a standalone server and worker and the file...
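Point 1 above can be sketched as a bind mount that keeps the host path and the in-container path identical, so the path GPUStack records resolves inside the container too (the directory is an example, not from the source):

```shell
# Sketch: map the host model directory into the GPUStack container at the
# same path (/data/models is an example path).
docker run -d --gpus all \
  -v /data/models:/data/models \
  gpustack/gpustack
```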

You have several options:
1. Provide an NFS share folder and mount it to the same path on all worker nodes or even the server node. Then all the nodes...
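Option 1 above can be sketched as a plain NFS mount repeated on each node; the server address and paths are examples, not from the source:

```shell
# Sketch: mount a shared NFS export at the same path on every worker node
# (and optionally the server node). Address and paths are placeholders.
sudo mkdir -p /data/models
sudo mount -t nfs 192.168.1.10:/exports/models /data/models

# To persist across reboots, an equivalent /etc/fstab entry:
# 192.168.1.10:/exports/models  /data/models  nfs  defaults  0  0
```

Mounting at an identical path everywhere matters because the model file path stored by the server must resolve the same way on whichever worker the instance is scheduled to.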