ipex-llm
Arc770 IPEX-LLM performance issue
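While testing large models, single-card performance meets expectations, but performance degrades severely in the dual-card and 4-card Arc770 configurations. Input tokens -> output tokens: 1024 -> 512.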
Built following the documentation here: https://github.com/intel-analytics/ipex-llm/blob/main/docs/readthedocs/source/doc/LLM/DockerGuides/vllm_docker_quickstart.md. IPEX-LLM version: 2.1.0b2.
Please try the latest Docker image: intelanalytics/ipex-llm-serving-xpu:2.2.0-SNAPSHOT. As shown in the data we reviewed with you, 1xARC and 2xARC performance should be fine. 4xARC performance is degraded due to high communication overhead, and we are working on improving it.