Xiaobin
I have the same question. Is there a solution for it?
How about using `nvml.DeviceGetTopologyCommonAncestor` to build the GPU tree?
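A minimal sketch of that idea, assuming the pairwise topology level between two GPUs is already available. The `level_of` callable is hypothetical: in real code it would wrap the NVML common-ancestor query, while here `demo_level` is a stub so the example runs without a GPU. The numeric levels mirror the `nvmlGpuTopologyLevel_t` enum, and the names `partition` and `build_tree` are illustrative only. It also assumes the levels form a proper hierarchy, so GPUs are merged transitively.

```python
from typing import Callable, List

# Topology levels, lowest (closest) first. Values mirror nvmlGpuTopologyLevel_t.
LEVELS = [
    (10, "SINGLE"),      # GPUs share a single PCIe switch
    (20, "MULTIPLE"),    # path crosses multiple PCIe switches
    (30, "HOSTBRIDGE"),  # GPUs share a host bridge
    (40, "NODE"),        # GPUs share a NUMA node
    (50, "SYSTEM"),      # path crosses NUMA nodes
]

LevelFn = Callable[[int, int], int]


def partition(gpus: List[int], level_of: LevelFn, max_level: int) -> List[List[int]]:
    """Group GPUs whose common ancestor is at or below max_level."""
    groups: List[List[int]] = []
    for g in gpus:
        merged = [g]
        rest = []
        for grp in groups:
            # Merge transitively: g joins every group it is close enough to.
            if any(level_of(g, h) <= max_level for h in grp):
                merged += grp
            else:
                rest.append(grp)
        groups = rest + [sorted(merged)]
    return sorted(groups)


def build_tree(gpus: List[int], level_of: LevelFn, depth: int = len(LEVELS) - 1):
    """Return a nested dict; leaves are GPU indices. Levels that do not split are skipped."""
    if len(gpus) == 1:
        return gpus[0]
    if depth == 0:
        return {"level": LEVELS[0][1], "children": list(gpus)}
    groups = partition(gpus, level_of, LEVELS[depth - 1][0])
    if len(groups) == 1:
        return build_tree(gpus, level_of, depth - 1)
    return {
        "level": LEVELS[depth][1],
        "children": [build_tree(g, level_of, depth - 1) for g in groups],
    }


def demo_level(a: int, b: int) -> int:
    """Stub topology: GPUs 0-1 and 2-3 each share a switch; the pairs meet at SYSTEM."""
    return 10 if a // 2 == b // 2 else 50


if __name__ == "__main__":
    print(build_tree([0, 1, 2, 3], demo_level))
```

A scheduler could then walk this tree from the leaves upward and prefer the smallest subtree that holds enough free GPUs for a request.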
I also don't think using the device plugin to build the GPU tree is a good approach. Although the device plugin can work together with Volcano, we still need to judge if the gpu resource...
> > I also don't think using the device plugin to build the GPU tree is a good approach. Although the device plugin can work together with Volcano, we still need to judge if the...
Hello, I followed the [blog](https://jacobtomlinson.dev/posts/2022/quick-hack-adding-gpu-support-to-kind/) to add GPU support to a kind cluster. When installing `k8s-device-plugin`, I ran into the following problem: ``` Error: failed to create containerd task: failed to create shim...
Hello, I have some questions about KServe model storage. If I have a model downloaded from Hugging Face and a Python file that loads the model, how do I use the `InferenceService` CRD? Like...
Hello, I wonder why the prefill and decode pods need to be put into the same roleset here?
This is not a bug; it just modifies the logic to use a different value. The result is the same as it is now. Please confirm whether the change is necessary.
One vLLM instance serving multiple models, or separate vLLM processes?