SONG Ge
You may clear the model with `del llm_model`.
You may use `st.cache_resource.clear()` and then recreate the model, as below:

```python
model = create_model(name1)
del model
st.cache_resource.clear()
model = create_model(name2)
```
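For context, here is a minimal Streamlit sketch of how this fits together, assuming `create_model` is a hypothetical loader decorated with `@st.cache_resource` (the names and the loader call are illustrative, not from the code above):

```python
import streamlit as st

@st.cache_resource  # Streamlit caches one model object per argument combination
def create_model(model_name: str):
    # Hypothetical loader; replace with your actual model-loading call.
    from transformers import AutoModelForCausalLM
    return AutoModelForCausalLM.from_pretrained(model_name)

model = create_model("model-a")

if st.button("Switch model"):
    del model                  # drop the local reference
    st.cache_resource.clear()  # evict the cached object so its memory can be freed
    model = create_model("model-b")
```

Clearing the cache matters because `del` alone only removes the local name; the object cached inside `st.cache_resource` still holds the model until the cache is cleared.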
Here is my implementation. The error happens when `ret = (*resp->oh.zesInit)(0);` is executed, even though `ze_intel_gpu64.dll` can be loaded:
https://github.com/felipeagc/ollama/blob/main/gpu/gpu_info_oneapi.c
https://github.com/felipeagc/ollama/blob/main/gpu/gpu_info_oneapi.h
I also tried `"C:\Windows\System32\ze_loader.dll"`, but still got error, this also related to `ret = (*resp->oh.zesInit)(0);` : 
Are you running ollama in a Docker container? And could you show the output of `sycl-ls` after activating the oneAPI environment?
May I ask if mllama could be compiled in this PR? I didn't see the relevant CMakeLists.
Hi @pauleseifert. I think this is an OOM issue; you may try setting `OLLAMA_PARALLEL=1` before you start `ollama serve` to reduce memory usage.
1. Sorry for the typo: it should be `OLLAMA_PARALLEL=1` instead of `OLLAMA_NUM_PARALLEL` (see the sketch below for setting it).
2. Could you please check and provide your GPU memory usage when running Ollama?
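If it helps, here is a minimal Python sketch for launching the server with that variable set (equivalent to running `set OLLAMA_PARALLEL=1` before `ollama serve` in a Windows shell; the use of `subprocess` here is just an illustration):

```python
import os
import subprocess

# Copy the current environment and add the parallelism limit.
env = dict(os.environ, OLLAMA_PARALLEL="1")

# Start the ollama server with the modified environment.
subprocess.Popen(["ollama", "serve"], env=env)
```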
Can you provide the memory usage before and after running `ollama run <model>`? That would help us resolve the issue.
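One way to capture the host-RAM side of that automatically is a small `psutil` sketch like the one below (assumptions on my part: `psutil` installed via `pip install psutil`, and a placeholder model name; GPU memory itself is easiest to read from Task Manager's GPU tab on Windows):

```python
import subprocess
import psutil  # assumption: installed via `pip install psutil`

def used_gib() -> float:
    """Host RAM currently in use, in GiB."""
    return psutil.virtual_memory().used / 2**30

before = used_gib()
# Placeholder model name and prompt -- substitute the model you are testing.
subprocess.run(["ollama", "run", "llama3", "hello"], check=True)
after = used_gib()

print(f"host RAM used: {before:.2f} GiB -> {after:.2f} GiB")
```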
Hi @jianjungu, you can also refer to the [ipex-llm ollama quickstart](https://github.com/intel-analytics/ipex-llm/blob/main/docs/mddocs/Quickstart/ollama_quickstart.md) for the current ollama version.