vllm
Higher latency than TGI
Is it normal for vLLM to have higher latency than TGI at low concurrency, such as 1 or 4 concurrent requests?
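To make the comparison concrete, here is a minimal sketch of how per-request latency at a fixed concurrency could be measured. It is not tied to either server: `fake_request` is a hypothetical stand-in that only sleeps, and `measure_latency` and `summarize` are illustrative names. It would need to be replaced with a real client call to each backend.

```python
import asyncio
import statistics
import time


async def fake_request(prompt: str) -> str:
    # Hypothetical stand-in for a request to a vLLM or TGI server.
    # It only sleeps; swap in a real client call to benchmark a backend.
    await asyncio.sleep(0.01)
    return prompt.upper()


async def measure_latency(send, prompts, concurrency):
    """Return per-request latencies (seconds), with at most
    `concurrency` requests in flight at any time."""
    semaphore = asyncio.Semaphore(concurrency)

    async def timed(prompt):
        async with semaphore:
            # Start the clock only once the request is actually sent,
            # so time spent waiting for a slot is not counted.
            start = time.perf_counter()
            await send(prompt)
            return time.perf_counter() - start

    return await asyncio.gather(*(timed(p) for p in prompts))


def summarize(latencies):
    # Mean and worst-case latency over all requests.
    return {"mean": statistics.mean(latencies), "max": max(latencies)}


if __name__ == "__main__":
    prompts = ["hello"] * 8
    for concurrency in (1, 4):
        latencies = asyncio.run(
            measure_latency(fake_request, prompts, concurrency)
        )
        print(concurrency, summarize(latencies))
```

Running the same prompts, generation settings, and concurrency levels against both servers keeps the numbers comparable.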