ray_vllm_inference
Can I run a model service on multiple GPUs?
I want to run a model service on multiple GPUs without tensor parallelism (TP), i.e. run an independent full copy of the model on each GPU rather than sharding one model across them.
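For reference, a minimal sketch of one common way to do this with Ray Serve: scale out with multiple single-GPU replicas, each holding its own vLLM engine with `tensor_parallel_size=1`, so Serve load-balances requests across GPUs instead of using TP. The model name, replica count, and request schema below are placeholders, not from this repo.

```python
# Sketch: data-parallel serving with Ray Serve replicas, no tensor parallelism.
from ray import serve
from vllm import LLM, SamplingParams


@serve.deployment(
    num_replicas=2,                      # one replica per GPU
    ray_actor_options={"num_gpus": 1},   # pin each replica to a single GPU
)
class VLLMReplica:
    def __init__(self):
        # Each replica loads a full copy of the model on its own GPU;
        # tensor_parallel_size=1 disables TP sharding.
        self.llm = LLM(model="facebook/opt-125m", tensor_parallel_size=1)

    async def __call__(self, request):
        # Assumed request body: {"prompt": "..."}
        prompt = (await request.json())["prompt"]
        params = SamplingParams(max_tokens=64)
        # Note: generate() is blocking; fine for a sketch, but a real
        # deployment would use vLLM's async engine instead.
        outputs = self.llm.generate([prompt], params)
        return {"text": outputs[0].outputs[0].text}


app = VLLMReplica.bind()
# serve.run(app)  # Serve round-robins incoming requests across the replicas.
```

With this layout, throughput scales with the number of replicas, but each GPU must be large enough to hold the whole model, which is the usual trade-off versus TP.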