
Ollama Linux No Response Issue with IPEX-LLM

Open · RobinJing opened this issue 1 year ago · 2 comments

OS: Linux Ubuntu 22.04
Kernel: 5.13
GPU: Intel Arc A770
Platform: RPL-P

After installing and starting ollama following the guide, queries get no response, and ollama prints nothing on its side.

```
|  |                   |                                | Compute    | Max compute | Max work | Max sub |                 |
|ID| Device Type       | Name                           | capability | units       | group    | group   | Global mem size |
| 0| [level_zero:gpu:0]| Intel(R) Arc(TM) A770 Graphics |        1.3 |         512 |     1024 |      32 |     16225243136 |
ggml_backend_sycl_set_mul_device_mode: true
detect 1 SYCL GPUs: [0] with top Max compute units:512
llm_load_tensors: ggml ctx size = 0.30 MiB
llm_load_tensors: offloading 32 repeating layers to GPU
llm_load_tensors: offloading non-repeating layers to GPU
llm_load_tensors: offloaded 33/33 layers to GPU
llm_load_tensors: SYCL0 buffer size = 3577.56 MiB
llm_load_tensors: CPU buffer size = 70.31 MiB
..................................................................................................
llama_new_context_with_model: n_ctx      = 2048
llama_new_context_with_model: n_batch    = 512
llama_new_context_with_model: n_ubatch   = 512
llama_new_context_with_model: freq_base  = 10000.0
llama_new_context_with_model: freq_scale = 1
llama_kv_cache_init: SYCL0 KV buffer size = 1024.00 MiB
llama_new_context_with_model: KV self size = 1024.00 MiB, K (f16): 512.00 MiB, V (f16): 512.00 MiB
llama_new_context_with_model: SYCL_Host output buffer size = 0.14 MiB
llama_new_context_with_model: SYCL0 compute buffer size = 180.00 MiB
llama_new_context_with_model: SYCL_Host compute buffer size = 12.01 MiB
llama_new_context_with_model: graph nodes = 1062
llama_new_context_with_model: graph splits = 2
```

RobinJing avatar May 23 '24 02:05 RobinJing

The issue has been reproduced, and we are working on resolving it.

sgwhat avatar May 23 '24 05:05 sgwhat

Before starting the ollama server, please set the environment variables as below (replace `your_oneapi_version` with your installed oneAPI version):

```
export LD_LIBRARY_PATH=/opt/intel/oneapi/mkl/your_oneapi_version/lib:/opt/intel/oneapi/compiler/your_oneapi_version/lib
```
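A minimal sketch of filling in the version placeholder, assuming oneAPI is installed under the default `/opt/intel/oneapi` prefix; the `2024.0` version string here is only illustrative, not the value from the comment above:

```shell
# List installed MKL versions to find the right directory name, if the
# default install prefix exists (hypothetical example; adjust to your system).
ls /opt/intel/oneapi/mkl 2>/dev/null || true

# Illustrative version string -- replace with a version listed above.
ONEAPI_VERSION=2024.0

# Build LD_LIBRARY_PATH, preserving any existing value by appending it.
export LD_LIBRARY_PATH=/opt/intel/oneapi/mkl/$ONEAPI_VERSION/lib:/opt/intel/oneapi/compiler/$ONEAPI_VERSION/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}
```

Alternatively, sourcing `/opt/intel/oneapi/setvars.sh` sets up these library paths (among others) automatically, assuming a standard oneAPI installation.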

sgwhat avatar May 23 '24 07:05 sgwhat