Hi @RonkyTang, we are working on upgrading ipex-llm ollama to a new version, and these two GLM models will be supported then.
> Hi, [@sgwhat](https://github.com/sgwhat) could you please share the schedule for the release? Thanks!

I will release v0.6.x support next week.
Hi @RonkyTang, I have found the cause, and it will be fixed in tomorrow's version.
Hi @RonkyTang, I am still working on getting this model's clip part to run on the SYCL backend. I will come back to you once this issue has been fixed after a...
Hi @RonkyTang, we have released the new version of ollama at https://github.com/intel/ipex-llm/releases/tag/v2.3.0-nightly. We have optimized the clip model to run on the GPU on Windows.
Hi @RonkyTang, it seems that on Ubuntu the clip model is still forced to run on the CPU (it works well with great performance on Windows). This has been fixed and I will release the...
Hi @RonkyTang, we have released the optimized version on Ubuntu, which can run the clip model on the GPU. You may install it via `pip install --pre --upgrade ipex-llm[cpp]`.
Yes, in the conda env. You may refer to this [installation guide](https://github.com/intel/ipex-llm/blob/main/docs/mddocs/Quickstart/ollama_quickstart.md).
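In case it helps, here is a minimal sketch of the install-and-run flow on Ubuntu, following the quickstart linked above. The env name and working directory are only illustrative; please check the quickstart for the exact steps for your setup:

```bash
# Create and activate a conda env (name is illustrative)
conda create -n llm-cpp python=3.11 -y
conda activate llm-cpp

# Install/upgrade the ipex-llm ollama package
pip install --pre --upgrade "ipex-llm[cpp]"

# Initialize ollama in a working directory (per the quickstart),
# then start the server
mkdir -p ~/ollama-ipex && cd ~/ollama-ipex
init-ollama
./ollama serve
```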
This is expected behavior: Ollama does not utilize the iGPU until a model is loaded, at which point you will see VRAM usage increase. As for the confusing log...
> So, do you mean the preview version used iGPU?

Yes, you may load a model to check.
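For example, one quick way to check (the model name below is only an example, and the exact `ollama` subcommands available depend on the underlying ollama version):

```bash
# Pull and load any model; VRAM usage should rise once it is loaded
./ollama run qwen2:1.5b "hello"

# In another terminal, list loaded models; the PROCESSOR column
# indicates whether the model is resident on GPU or CPU
./ollama ps
```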