Bing Han
Results: 1 comment by Bing Han
It does seem that QPS needs to be selected carefully. I switched to static with just 5 requests sent with vLLM v1, and the P50 e2e latency error is now 30%. Do...