What is the current ollama version?
v0.5.4, and we are working on releasing v0.6.2 for Linux first.
Will a version supporting v0.6.2 for Linux be released within the next two weeks?
v0.6.2 has been released. You may install it via pip install --pre --upgrade ipex-llm[cpp].
That's confusing. You install ollama by installing --pre ipex-llm?
You can get v0.6.2 packages from https://github.com/intel/ipex-llm/releases/tag/v2.3.0-nightly now.
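To untangle the two routes mentioned above: pip install --pre --upgrade ipex-llm[cpp] pulls in an IPEX-LLM build of ollama that you expose with the init-ollama helper described in the ipex-llm quickstart, while the release page above ships the same thing as a portable zip. A minimal sketch of the pip route on Linux (helper name and flow taken from the quickstart; verify against the docs for your version):

```bash
# Sketch of the pip route, per the ipex-llm llama.cpp/ollama quickstart.
# Assumes a Python env (e.g. conda) and Intel GPU drivers are already set up.
pip install --pre --upgrade ipex-llm[cpp]

mkdir -p ollama-ipex && cd ollama-ipex
init-ollama          # creates symlinks to the bundled ollama binary here

./ollama serve       # in another shell: ./ollama run <model>
```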
Ah. The portable binary. Would building the container via https://github.com/intel/ipex-llm/blob/main/docs/mddocs/DockerGuides/docker_cpp_xpu_quickstart.md also result in this version?
@kirel Yes, it's the latest version we put on PyPI, see https://github.com/intel/ipex-llm/blob/73198d5b80dd584de58fc3625ca0cdf78b4f8e42/docker/llm/inference-cpp/Dockerfile#L61
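For the container route, a rough sequence based on that quickstart looks like the following; the image name and flags come from the guide, so double-check the current tag before relying on this:

```bash
# Pull the prebuilt image referenced in the quickstart (check the guide for
# the current tag) and expose the Intel GPU to the container.
docker pull intelanalytics/ipex-llm-inference-cpp-xpu:latest

docker run -itd \
  --net=host \
  --device=/dev/dri \
  --shm-size=16g \
  --name=ipex-llm-ollama \
  intelanalytics/ipex-llm-inference-cpp-xpu:latest

# The ollama build inside is the one pinned in the Dockerfile line linked above.
docker exec -it ipex-llm-ollama bash
```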
(llm-cpp) ubuntu@ubuntu-NUC12SNKi72:~/ollama-ipex-llm-2.3.0b20250415-ubuntu$ ./ollama --version
ollama version is 0.0.0
I have an Intel Arc B580 Graphics with 12 GB VRAM, driver version 1.6.32567+18.
I tried both updating my current Ollama install via pip install --pre --upgrade ipex-llm[cpp]
and the portable package (Portable.2.30v.zip), with the same result for Gemma3.
It works with most other LLMs, for example Mistral-nemo, deepseek-r1, and phi4, but when I try Gemma3 it always errors as follows:
time=2025-04-21T15:28:14.524+10:00 level=INFO source=ggml.go:369 msg="compute graph" backend=SYCL0 buffer_type=SYCL0
time=2025-04-21T15:28:14.524+10:00 level=INFO source=ggml.go:369 msg="compute graph" backend=CPU buffer_type=SYCL_Host
time=2025-04-21T15:28:14.525+10:00 level=WARN source=ggml.go:149 msg="key not found" key=tokenizer.ggml.pretokenizer default="(?i:'s|'t|'re|'ve|'m|'ll|'d)|[^\r\n\p{L}\p{N}]?\p{L}+|\p{N}{1,3}| ?[^\s\p{L}\p{N}]+[\r\n]|\s[\r\n]+|\s+(?!\S)|\s+"
time=2025-04-21T15:28:14.529+10:00 level=WARN source=ggml.go:149 msg="key not found" key=tokenizer.ggml.add_eot_token default=false
time=2025-04-21T15:28:14.532+10:00 level=WARN source=ggml.go:149 msg="key not found" key=tokenizer.ggml.pretokenizer default="(?i:'s|'t|'re|'ve|'m|'ll|'d)|[^\r\n\p{L}\p{N}]?\p{L}+|\p{N}{1,3}| ?[^\s\p{L}\p{N}]+[\r\n]|\s[\r\n]+|\s+(?!\S)|\s+"
time=2025-04-21T15:28:14.538+10:00 level=WARN source=ggml.go:149 msg="key not found" key=gemma3.attention.layer_norm_rms_epsilon default=9.999999974752427e-07
time=2025-04-21T15:28:14.538+10:00 level=WARN source=ggml.go:149 msg="key not found" key=gemma3.rope.local.freq_base default=10000
time=2025-04-21T15:28:14.538+10:00 level=WARN source=ggml.go:149 msg="key not found" key=gemma3.rope.global.freq_base default=1e+06
time=2025-04-21T15:28:14.538+10:00 level=WARN source=ggml.go:149 msg="key not found" key=gemma3.rope.freq_scale default=1
time=2025-04-21T15:28:14.538+10:00 level=WARN source=ggml.go:149 msg="key not found" key=gemma3.mm_tokens_per_image default=256
time=2025-04-21T15:28:14.739+10:00 level=INFO source=server.go:628 msg="llama runner started in 2.68 seconds"
[GIN] 2025/04/21 - 15:28:14 | 200 | 2.903381703s | 127.0.0.1 | POST "/api/generate"
panic: failed to sample token: no tokens to sample from
I have a second server running two Tesla P4s that runs Gemma3 with zero issues. Hopefully this will get fixed for my Intel card, as it is heaps faster than the Tesla P4s.
Thank you.
I am also seeing the exact same issue with Gemma3 on my Arc A770 using the ollama portable binary. Other models run fine, just not Gemma3. Additionally, I'm unable to load a model larger than VRAM and have it split between CPU and GPU like I can with mainline Ollama.
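On mainline Ollama the CPU/GPU split is driven by the num_gpu option (the number of layers offloaded to the GPU); whether the SYCL/ipex-llm build honors it the same way is something I haven't confirmed, but it is the first knob I would test:

```bash
# Ask the server to offload only part of the layers to the GPU and keep the
# rest on the CPU. num_gpu is a standard Ollama request option; its behavior
# on the SYCL backend is an assumption to verify.
curl http://localhost:11434/api/generate -d '{
  "model": "gemma3:12b",
  "prompt": "hello",
  "options": { "num_gpu": 24 }
}'
```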
https://github.com/chnxq/ollama/tree/chnxq/add-oneapi
You can try the above one. It uses the latest ollama version, and gemma3:12b works, but it has only been tested on Windows. ref: /llama/README-Intel-OneApi.md
https://github.com/intel/ipex-llm/issues/13070
Someone has also tested it on Linux.
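For Linux, the actual build steps live in llama/README-Intel-OneApi.md on that branch; the sketch below only covers fetching the branch and loading the oneAPI environment (default install path assumed), not the build itself:

```bash
# Fetch the oneAPI-enabled branch mentioned above.
git clone -b chnxq/add-oneapi https://github.com/chnxq/ollama.git
cd ollama

# Load the oneAPI toolchain (assumes the default installation prefix).
source /opt/intel/oneapi/setvars.sh

# From here, follow llama/README-Intel-OneApi.md in this repo for the actual
# build commands; they may differ between Windows and Linux.
```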
I will try to install it on Linux. Thank you for your work.