Bing Han

1 comment by Bing Han

It does seem that QPS needs to be selected carefully. I switched to static with just 5 requests sent with vLLM v1, and the P50 e2e latency error is now 30%. Do...