Arc770 IPEX-LLM performance issue

Open yangluchina opened this issue 1 year ago • 1 comment

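While testing large models, single-card performance meets expectations, but performance drops significantly in the dual-card and 4-card Arc770 configurations. Input tokens -> output tokens: 1024 -> 512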

Built following this document: https://github.com/intel-analytics/ipex-llm/blob/main/docs/readthedocs/source/doc/LLM/DockerGuides/vllm_docker_quickstart.md (IPEX-LLM version 2.1.0b2)

Sep 30 '24 09:09 yangluchina

Please try the latest Docker image: intelanalytics/ipex-llm-serving-xpu:2.2.0-SNAPSHOT. Per the data we reviewed with you, 1xARC and 2xARC performance should be fine. 4xARC performance is degraded due to high communication overhead, and we are working on improving it.

Oct 08 '24 01:10 glorysdj