Nguyen Nhi Thanh Tai

Results: 1 comment by Nguyen Nhi Thanh Tai

If anybody runs vLLM on Triton server: Triton server will automatically run your LLM instance on every available GPU. So if you have 2 GPUs and you run with --tensor-parallel-size 2....