SONG Ge


Hi @RonkyTang, we are working on upgrading ipex-llm ollama to a new version, and these two GLM models will be supported then.

> Hi, [@sgwhat](https://github.com/sgwhat) could you please share the schedule for the release? thanks!

I will release v0.6.x support next week.

Hi @RonkyTang, I have found out the reason, and it will be fixed in tomorrow's version.

Hi @RonkyTang, I am still working on getting this model's clip part to run on the SYCL backend. I will get back to you once this issue has been fixed after a...

Hi @RonkyTang, we have released the new version of ollama in https://github.com/intel/ipex-llm/releases/tag/v2.3.0-nightly. We have optimized the clip model to run on GPU on Windows.

Hi @RonkyTang, it seems that on Ubuntu, clip was still forced to run on CPU (it works well with great performance on Windows). This has been fixed and I will release the...

Hi @RonkyTang, we have released the optimized version on Ubuntu, which can run the clip model on GPU. You may install it via `pip install --pre --upgrade ipex-llm[cpp]`.
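After upgrading, a typical Ubuntu session might look like the sketch below; the `init-ollama` step and the environment variables follow the ipex-llm quickstart, but the oneAPI path and variable values may differ on your machine:

```bash
# Sketch of serving ollama with GPU offload on Ubuntu (assumes the
# ipex-llm quickstart layout; adjust paths/env vars for your setup)
pip install --pre --upgrade ipex-llm[cpp]
init-ollama                          # symlinks the ipex-llm ollama binary here
source /opt/intel/oneapi/setvars.sh  # load the oneAPI runtime for the GPU
export OLLAMA_NUM_GPU=999            # offload all model layers to the GPU
export ZES_ENABLE_SYSMAN=1           # enable GPU memory reporting
./ollama serve
```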

Yes, in the conda env. You may refer to this [installation guide](https://github.com/intel/ipex-llm/blob/main/docs/mddocs/Quickstart/ollama_quickstart.md).
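For reference, a minimal conda setup along the lines of that guide could look like this; the environment name and Python version are illustrative:

```bash
# Illustrative conda environment for ipex-llm's ollama
conda create -n llm-cpp python=3.11
conda activate llm-cpp
pip install --pre --upgrade ipex-llm[cpp]
```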

This is expected behavior: Ollama does not utilize the iGPU until a model is loaded, at which point you will see VRAM usage increase. As for the confusing log...

> So, do you mean the preview version used iGPU?

Yes, you may load a model to check.
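To verify, something like the following would do; the model name is just an example, and GPU memory can be read with a tool such as Intel's xpu-smi:

```bash
# Load any small model and watch VRAM rise (model name is an example)
./ollama run qwen2:1.5b "hello"
# Check GPU memory/utilization on device 0
xpu-smi stats -d 0
```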