SONG Ge

Results: 164 comments of SONG Ge

You may clear the model with `del llm_model`.

You may use `st.cache_resource.clear()` to drop the cached resource and then create a new model, as below:

```python
model = create_model(name1)
del model
st.cache_resource.clear()
model = create_model(name2)
```
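The reason `del model` alone is not enough is that Streamlit's resource cache holds its own reference to the model. A minimal stand-in for that caching behavior (plain Python, no Streamlit; `create_model` is a hypothetical factory) sketches why the cache must be cleared as well:

```python
# Minimal stand-in for st.cache_resource: memoize the created object
# by its arguments, with a clear() hook like st.cache_resource.clear().
_cache = {}

def cache_resource(fn):
    def wrapper(*args):
        if args not in _cache:
            _cache[args] = fn(*args)
        return _cache[args]
    wrapper.clear = _cache.clear  # mirrors st.cache_resource.clear()
    return wrapper

@cache_resource
def create_model(name):
    # Placeholder for an expensive model load (hypothetical factory).
    return object()

m1 = create_model("name1")
del m1                                           # drops only the local reference
assert ("name1",) in _cache                      # the cache still holds the model
create_model.clear()                             # actually frees the cached entry
assert ("name1",) not in _cache
m2 = create_model("name2")                       # a fresh model is built
```

This is only a sketch of the caching semantics, not Streamlit's implementation; the point is that the cached reference keeps the old model alive until the cache itself is cleared.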

Here is my implementation; the error happens when `ret = (*resp->oh.zesInit)(0);` is executed, even though `ze_intel_gpu64.dll` loads successfully. https://github.com/felipeagc/ollama/blob/main/gpu/gpu_info_oneapi.c https://github.com/felipeagc/ollama/blob/main/gpu/gpu_info_oneapi.h

I also tried `"C:\Windows\System32\ze_loader.dll"`, but still got an error, also related to `ret = (*resp->oh.zesInit)(0);`: ![image](https://github.com/intel/compute-runtime/assets/38711238/5b3a42dc-89c6-4a62-bf95-383abf319d87)

Are you running ollama in a Docker container? And could you show the output of `sycl-ls` after activating oneAPI?

May I ask whether mllama can be compiled in this PR? I didn't see the relevant CMakeLists.

Hi @pauleseifert. I think this is an OOM issue; you may try setting `OLLAMA_PARALLEL=1` before starting `ollama serve` to reduce memory usage.

1. Sorry for the typo, it should be `OLLAMA_PARALLEL=1` instead of `OLLAMA_NUM_PARALLEL`.
2. Could you please check and provide your GPU memory usage when running Ollama?

Can you provide the memory usage before and after running `ollama run `? This will help us resolve the issue.
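One way to capture that before/after comparison is a quick shell snippet. This is a Linux sketch using `free` for system memory; for GPU memory on Intel hardware you would substitute your platform's tool (e.g. `xpu-smi`), which is an assumption about the setup, not something stated in the thread:

```shell
# Record used memory (MiB) before loading the model.
before=$(free -m | awk '/^Mem:/ {print $3}')

# ... start `ollama run <model>` in another terminal, wait for it to load ...

# Record used memory again and print the comparison.
after=$(free -m | awk '/^Mem:/ {print $3}')
echo "used before: ${before} MiB, used after: ${after} MiB"
```

Posting both numbers (and the equivalent GPU-memory readings) makes it much easier to confirm or rule out an OOM condition.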

Hi @jianjungu, you may also see [ipex-llm ollama quickstart](https://github.com/intel-analytics/ipex-llm/blob/main/docs/mddocs/Quickstart/ollama_quickstart.md) for the current ollama version. ![image](https://github.com/user-attachments/assets/a946a99f-0254-4fc8-95a1-1eb95d5c4a6d)