SONG Ge
I just cannot figure out why 99% and 99.9% of FP32 are the quality target. 😂
Hi @shailesh837, we are working on reproducing your issue.
1. For your first question, please set the following environment variables for optimal performance (before running your program):

   ```bash
   export USE_XETLA=OFF
   export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
   export SYCL_CACHE_PERSISTENT=1
   ```

2. For your second...
### Reason

By default, Ollama unloads the model from GPU memory after 5 minutes of inactivity.

### Solution

1. For the latest version of Ollama, you can set `export OLLAMA_KEEP_ALIVE=-1` to keep...
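As an illustration of the step above, the keep-alive setting can be applied before launching the server; this is a sketch assuming Ollama is started manually from a shell (`-1` disables the idle unload timeout):

```shell
# Keep the model resident in GPU memory indefinitely
# (-1 disables Ollama's default 5-minute idle unload).
export OLLAMA_KEEP_ALIVE=-1

# Start the Ollama server with the setting in effect.
ollama serve
```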
Hi Shailesh, I didn't see any GPU device from `sycl-ls` in your log. Could you please check your oneAPI installation and remember to `source /opt/intel/oneapi/setvars.sh`?

```bash
spandey2@IMU-NEX-EMR1-SUT:~/LLM_SceneScape_ChatBot$ sycl-ls
[opencl:acc:0] Intel(R)...
```
Hi @shailesh837,

1. You may switch to our latest release version of Ollama by running `pip install --pre --upgrade ipex-llm[cpp]`. In this version, `libllama_bigdl_core.so` is no longer required....
Hi @Daroude, I have replicated the issue you're experiencing. Please ensure that you have correctly installed and initialized Intel oneAPI. For example:

```bash
# on windows
call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"...
```
The issue has been reproduced, and we are working on resolving it.
Before starting the Ollama server, please set the environment config as below:

```bash
export LD_LIBRARY_PATH=/opt/intel/oneapi/mkl/your_oneapi_version/lib:/opt/intel/oneapi/compiler/your_oneapi_version/lib
```
Hi @Quallyjiang, could you please pin `transformers==4.38.2` as a temporary workaround?
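For reference, pinning the suggested version looks like this; a sketch assuming a pip-based environment:

```shell
# Pin transformers to the suggested working version.
pip install transformers==4.38.2

# Confirm the installed version.
python -c "import transformers; print(transformers.__version__)"
```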