SONG Ge
Hi @lumurillo, we have reproduced this issue and will keep you informed as we make progress.
Hi @Kaszanas, you will not see any Intel GPU info until you load a model.
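One quick way to see this for yourself (a sketch, assuming a Linux shell, the `./ollama` symlink from the quickstart, and that the relevant log lines actually mention the GPU by name; the model tag is arbitrary):

```bash
# Hedged sketch: Intel GPU details appear in the serve log only after a model loads.
./ollama serve > server.log 2>&1 &   # start the server and capture its log
grep -i "gpu" server.log             # likely empty before any model is loaded
./ollama run llama3                  # loading a model triggers device detection
grep -i "gpu" server.log             # GPU info should show up after the load
```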
Hi @Cbaoj, you may use `transformers==4.38.2` to get better performance; we are working on optimizing Llama model performance on 4.38.x.
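In case it helps, pinning the version is a one-liner; the `pip show` step is just an optional sanity check:

```bash
# Hedged workaround: pin transformers to 4.38.2 until the 4.38.x optimization work lands.
pip install transformers==4.38.2
pip show transformers   # optional: confirm "Version: 4.38.2" is what the environment resolved
```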
Hi @publicarray, we do not support Gemma3 yet; our team is still working on it. We recommend switching to other models such as Qwen3 or DeepSeek-R1 in the meantime.
Hi @Fucalors,
1. Could you please provide the complete runtime log from the Ollama server side during model inference?
2. Could you please run `ls-sycl-device.exe` and reply with the...
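For reference, one way to capture both into files you can attach (a sketch; the file names are placeholders, and this assumes a shell where redirection works as shown):

```bash
# Hedged sketch for collecting the requested diagnostics; file names are arbitrary.
./ollama serve > ollama-server.log 2>&1 &   # capture the complete server-side runtime log
./ollama run <model>                        # reproduce the inference problem with your model
ls-sycl-device.exe > sycl-devices.txt       # dump the SYCL-visible device list
```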
Hi @Fucalors, I don't think you are running the ipex-llm Ollama. Please double-check your environment and installation method; you may refer to our documentation at https://ipex-llm-latest.readthedocs.io/en/latest/doc/LLM/Quickstart/ollama_quickstart.html for installing Ollama.
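Roughly, the install flow looks like this (a sketch based on the quickstart above; the environment name is arbitrary, and on Windows the last step is `init-ollama.bat`):

```bash
# Hedged sketch of installing the ipex-llm Ollama, following the quickstart.
conda create -n llm-cpp python=3.11   # "llm-cpp" is an arbitrary environment name
conda activate llm-cpp
pip install --pre --upgrade ipex-llm[cpp]
init-ollama                           # creates ollama symlinks in the current directory
```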
> +1 current version is 0.9.x and latest version of ollama supports embedding model at 0.12.x

Sorry @junesg, but we are not actively developing ollama at the moment.
Not sure; I have not been working on it recently.
@brownplayer, the Qwen3 model is supported in our latest version. You can install it via `pip install --pre --upgrade ipex-llm[cpp]`; see https://github.com/intel/ipex-llm/blob/main/docs/mddocs/Quickstart/ollama_quickstart.md for more usage details.
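After the upgrade, running it should be along these lines (a sketch assuming the symlinks created by `init-ollama` and that the model tag is simply `qwen3`; check the quickstart for the exact steps):

```bash
# Hedged sketch: serve and run Qwen3 with the ipex-llm Ollama.
init-ollama          # refresh the ollama symlinks after upgrading ipex-llm[cpp]
./ollama serve &     # start the server in the background
./ollama run qwen3   # "qwen3" is assumed to be the model tag; adjust as needed
```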
v0.5.4; we are working on releasing v0.6.2 for Linux first.