ipex-llm
Low number of parallel requests on Arc with vLLM serving
I get only 10 parallel requests on 2 Arc GPUs with a Qwen1.5 model (1024 input / 512 output). Could you please improve the performance?