
Can I run a model service on multiple GPUs?

Open zrl4836 opened this issue 1 year ago • 1 comment

I want to run a model service on multiple GPUs without using tensor parallelism (TP).
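One common way to use several GPUs without TP is data parallelism: run one full copy of the model per GPU, each in its own server process pinned to a device via `CUDA_VISIBLE_DEVICES`, and put a load balancer in front. A minimal sketch of how such per-GPU replica processes could be specified; the `vllm.entrypoints.api_server` entry point and its `--model`/`--port` flags are assumptions based on vLLM's CLI, and `replica_spec` is a hypothetical helper, not part of this repo:

```python
import os


def replica_spec(gpu_id: int, model: str, base_port: int = 8000):
    """Build the environment overrides and command line for one
    single-GPU vLLM server replica (data parallelism, no TP)."""
    # Pin this replica to exactly one GPU.
    env = {"CUDA_VISIBLE_DEVICES": str(gpu_id)}
    # Each replica listens on its own port; a load balancer
    # (e.g. nginx or a Ray Serve router) would sit in front.
    cmd = [
        "python", "-m", "vllm.entrypoints.api_server",
        "--model", model,
        "--port", str(base_port + gpu_id),
    ]
    return env, cmd


if __name__ == "__main__":
    import subprocess

    # Launch one replica per GPU; each holds a full model copy.
    procs = []
    for gpu in range(2):
        env, cmd = replica_spec(gpu, "facebook/opt-125m")
        procs.append(subprocess.Popen(cmd, env={**os.environ, **env}))
```

The trade-off versus TP: every GPU must be able to hold the whole model, but there is no inter-GPU communication per token, and throughput scales with the number of replicas.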

zrl4836 avatar Nov 29 '23 08:11 zrl4836