
Ollama offloads to the CPU, then crashes

Open FilipLaurentiu opened this issue 6 months ago • 5 comments

Describe the bug
Intel B580. I use the Docker container (latest version), but the bug also reproduces with the precompiled release. I tried multiple models (deepseek-r1:14b, gemma3:4b-it-fp16) with the same result: Ollama offloads the computation to the CPU and then crashes. Sometimes it freezes my PC and I need to restart. I can't read much from the logs.

log.txt

The docker-compose file I use:

name: 'ollama'

services:
  intel-llm:
    image: intelanalytics/ipex-llm-inference-cpp-xpu:latest
    container_name: intel-llm
    devices:
      - /dev/dri
    environment:
      no_proxy: localhost,127.0.0.1
      OLLAMA_HOST: 0.0.0.0
      DEVICE: Arc
      HOSTNAME: intel-llm
      OLLAMA_NUM_GPU: "999"
      ZES_ENABLE_SYSMAN: "1"
    volumes:
      - models:/root/.ollama/models
    ports:
      - "11434:11434"
    restart: always
    command: >
      sh -c 'mkdir -p /llm/ollama && cd /llm/ollama && init-ollama && exec ./ollama serve'

  openwebui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: openwebui
    environment:
      OLLAMA_BASE_URL: http://intel-llm:11434
    volumes:
      - open-webui:/app/backend/data
    ports:
      - "8080:8080"
    depends_on:
      - intel-llm

volumes:
  models:
    driver: local
    driver_opts:
      o: bind
      type: none
      device: /path/to/models
  open-webui:
    driver: local
    driver_opts:
      o: bind
      type: none
      device: /path/to/open-webui

How to reproduce
Steps to reproduce the error: I'm not sure exactly how to reproduce it. The problem seems to appear when I ask something complex right away. If I first input something very simple, like "hello", so that Ollama loads the model, and only then ask something more complex, that seems to help.
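For what it's worth, the warm-up workaround described above can be scripted against the Ollama HTTP API. This is only a sketch: it assumes the port published in the compose file (11434), that `curl` is available, and uses one of the models mentioned above.

```shell
# Hypothetical warm-up script: send a trivial prompt first so the model
# loads on the GPU, then send the real (complex) prompt.
OLLAMA_URL="http://localhost:11434"   # port published in the compose file

# 1. Warm-up: a trivial prompt to force the model load
curl -s "$OLLAMA_URL/api/generate" \
  -d '{"model": "gemma3:4b-it-fp16", "prompt": "hello", "stream": false}' > /dev/null

# 2. The actual, more complex prompt
curl -s "$OLLAMA_URL/api/generate" \
  -d '{"model": "gemma3:4b-it-fp16", "prompt": "Explain quicksort in detail.", "stream": false}'
```

If the crash only happens without step 1, that would support the theory that the failure is triggered during model load under a heavy first request.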

Environment information
Arch Linux, but that is probably not relevant because I use the Docker container.

➜  Downloads ./env-check.sh
-----------------------------------------------------------------
PYTHON_VERSION=3.13.3
-----------------------------------------------------------------
transformers=4.52.4
-----------------------------------------------------------------
torch=2.7.0
-----------------------------------------------------------------
ipex-llm WARNING: Package(s) not found: ipex-llm
-----------------------------------------------------------------
IPEX is not installed.
-----------------------------------------------------------------
CPU Information:
Architecture:                            x86_64
CPU op-mode(s):                          32-bit, 64-bit
Address sizes:                           52 bits physical, 57 bits virtual
Byte Order:                              Little Endian
CPU(s):                                  16
On-line CPU(s) list:                     0-15
Vendor ID:                               GenuineIntel
Model name:                              Intel(R) Xeon(R) w3-2435
CPU family:                              6
Model:                                   143
Thread(s) per core:                      2
Core(s) per socket:                      8
Socket(s):                               1
Stepping:                                8
CPU(s) scaling MHz:                      39%
CPU max MHz:                             4500.0000
CPU min MHz:                             800.0000
-----------------------------------------------------------------
Total CPU Memory: 62.2652 GB
-----------------------------------------------------------------
Operating System:
\S{PRETTY_NAME} \r (\l)

-----------------------------------------------------------------
Linux filip-thinkstation 6.15.2-arch1-1 #1 SMP PREEMPT_DYNAMIC Tue, 10 Jun 2025 21:32:33 +0000 x86_64 GNU/Linux
-----------------------------------------------------------------
CLI:
    Version: 1.2.35.20240423
    Build ID: efa70d34

Service:
    Version: 1.2.35.20240423
    Build ID: efa70d34
    Level Zero Version: 1.21.9
-----------------------------------------------------------------
fgrep: warning: fgrep is obsolescent; using grep -F
-----------------------------------------------------------------
Driver related package version:
./env-check.sh: line 161: dpkg: command not found
-----------------------------------------------------------------
./env-check.sh: line 167: sycl-ls: command not found
igpu not detected
-----------------------------------------------------------------
xpu-smi is properly installed.
-----------------------------------------------------------------
No device discovered
GPU0 Memory size=16G
-----------------------------------------------------------------
19:00.0 VGA compatible controller: Intel Corporation Battlemage G21 [Arc B580] (prog-if 00 [VGA controller])
        Subsystem: Device 172f:4215
        Flags: bus master, fast devsel, latency 0, IRQ 97, NUMA node 0
        Memory at a6000000 (64-bit, non-prefetchable) [size=16M]
        Memory at 201800000000 (64-bit, prefetchable) [size=16G]
        Expansion ROM at a7000000 [disabled] [size=2M]
        Capabilities: <access denied>
        Kernel driver in use: xe
        Kernel modules: xe
-----------------------------------------------------------------

FilipLaurentiu avatar Jun 12 '25 13:06 FilipLaurentiu

Hi, I am also a Linux user. I think it's not due to offloading to the CPU, but to a SYCL configuration error on your side. Maybe you can activate oneAPI and run sycl-ls to check.
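For reference, that check typically looks like the following. This is a sketch: the oneAPI install path inside Intel's image is assumed to be /opt/intel/oneapi, which may differ on other setups.

```shell
# Activate the oneAPI environment variables, then list SYCL devices.
source /opt/intel/oneapi/setvars.sh
sycl-ls
# A working setup should list the Arc B580 as a level_zero device;
# if only CPU devices appear, the GPU runtime is not visible to SYCL.
```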

Ellie-Williams-007 avatar Jun 13 '25 01:06 Ellie-Williams-007

> Hi, I am also a Linux user. I think it's not due to offloading to the CPU, but to a SYCL configuration error on your side. Maybe you can activate oneAPI and run sycl-ls to check.

Well, it shouldn't be relevant if I run it in a container. I use Arch and I can't find sycl-ls on the host.

FilipLaurentiu avatar Jun 13 '25 11:06 FilipLaurentiu

Sir, I think you may need to install a couple of missing packages on your Docker system ("dpkg: command not found", "sycl-ls: command not found"). Please open a terminal session into your Docker container and install what's needed. At startup, Ollama should tell you whether it has found a dedicated GPU and intends to use it. If it doesn't find the B580, you might want to start there. Best of luck.
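In practice, opening a shell in the running container to run those checks would look roughly like this (a sketch, using the container name from the compose file above):

```shell
# Open an interactive shell inside the running container
docker exec -it intel-llm bash

# Inside the container: check whether the GPU device nodes
# mapped via the compose "devices:" entry are actually visible
ls -l /dev/dri
```

If /dev/dri shows no renderD* node inside the container, the GPU was never passed through and CPU fallback is expected.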

stigva avatar Jun 14 '25 13:06 stigva

> Sir, I think you may need to install a couple of missing packages on your Docker system ("dpkg: command not found", "sycl-ls: command not found"). Please open a terminal session into your Docker container and install what's needed. At startup, Ollama should tell you whether it has found a dedicated GPU and intends to use it. If it doesn't find the B580, you might want to start there. Best of luck.

Sorry for the confusion: the env-check.sh output above was from my host computer. The Docker container is built by Intel, so I didn't think it was necessary to post that, but here is the output of env-check.sh run inside the latest container. It might be a problem with the recent SYCL version.

root@43d4a13eeb33:/llm# ./env-check.sh 
-----------------------------------------------------------------
PYTHON_VERSION=3.11.13
-----------------------------------------------------------------
/usr/local/lib/python3.11/dist-packages/transformers/utils/generic.py:441: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  _torch_pytree._register_pytree_node(
transformers=4.36.2
-----------------------------------------------------------------
torch=2.2.0+cu121
-----------------------------------------------------------------
ipex-llm Version: 2.3.0b20250612
-----------------------------------------------------------------
IPEX is not installed. 
-----------------------------------------------------------------
CPU Information: 
Architecture:                            x86_64
CPU op-mode(s):                          32-bit, 64-bit
Address sizes:                           52 bits physical, 57 bits virtual
Byte Order:                              Little Endian
CPU(s):                                  16
On-line CPU(s) list:                     0-15
Vendor ID:                               GenuineIntel
Model name:                              Intel(R) Xeon(R) w3-2435
CPU family:                              6
Model:                                   143
Thread(s) per core:                      2
Core(s) per socket:                      8
Socket(s):                               1
Stepping:                                8
CPU max MHz:                             4500.0000
CPU min MHz:                             800.0000
BogoMIPS:                                6192.00
-----------------------------------------------------------------
Total CPU Memory: 62.2652 GB
Memory Type: sudo: dmidecode: command not found
-----------------------------------------------------------------
Operating System: 
Ubuntu 22.04.5 LTS \n \l

-----------------------------------------------------------------
Linux 43d4a13eeb33 6.15.2-arch1-1 #1 SMP PREEMPT_DYNAMIC Tue, 10 Jun 2025 21:32:33 +0000 x86_64 x86_64 x86_64 GNU/Linux
-----------------------------------------------------------------
./env-check.sh: line 148: xpu-smi: command not found
-----------------------------------------------------------------
./env-check.sh: line 154: clinfo: command not found
-----------------------------------------------------------------
Driver related package version:
ii  intel-level-zero-gpu                             1.6.32224.5                             amd64        Intel(R) Graphics Compute Runtime for oneAPI Level Zero.
ii  intel-level-zero-gpu-legacy1                     1.3.30872.22                            amd64        Intel(R) Graphics Compute Runtime for oneAPI Level Zero.
ii  level-zero-devel                                 1.20.2                                  amd64        oneAPI Level Zero
-----------------------------------------------------------------
igpu not detected
-----------------------------------------------------------------
xpu-smi is not installed. Please install xpu-smi according to README.md

FilipLaurentiu avatar Jun 15 '25 12:06 FilipLaurentiu

The Intel team has released a new Linux portable zip release; give it a try and see if that works for you.

Ellie-Williams-007 avatar Jun 16 '25 01:06 Ellie-Williams-007