Dongjie Shi
Please try the latest Docker image: intelanalytics/ipex-llm-serving-xpu:2.2.0-SNAPSHOT. As shown in the data reviewed with you, the 1xARC/2xARC performance should be fine; 4xARC performance is degraded due to the high communication overhead, and...
Sorry, we don't have any plan for optimizing ollama for Xeon/AMX yet.
We are validating version 0.6.2 and the Qwen2-VL model, and will notify you once it's ready. Thanks.
Closed since there has been no update for a long time.