ipex-llm

Low parallel requests on Arc with vLLM serving

Open jessie-zhao opened this issue 7 months ago • 2 comments

Got only 10 parallel requests on 2 Arc GPUs with a Qwen1.5 model (1024 input / 512 output tokens). Could you please improve the performance?

jessie-zhao, Jul 11 '24 07:07