llama-stack
Run ollama gpu distribution failed
System Info
NVIDIA GPU A30
nvidia-smi
Thu Oct 31 11:43:51 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 555.42.02 Driver Version: 555.42.02 CUDA Version: 12.5 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA A30 Off | 00000000:01:00.0 Off | 0 |
| N/A 40C P0 27W / 165W | 17MiB / 24576MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+------------------------+----------------------+
| 1 NVIDIA A30 Off | 00000000:21:00.0 Off | 0 |
| N/A 39C P0 27W / 165W | 17MiB / 24576MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+------------------------+----------------------+
| 2 NVIDIA A30 Off | 00000000:41:00.0 Off | 0 |
| N/A 37C P0 29W / 165W | 17MiB / 24576MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+------------------------+----------------------+
| 3 NVIDIA A30 Off | 00000000:61:00.0 Off | 0 |
| N/A 40C P0 31W / 165W | 17MiB / 24576MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+------------------------+----------------------+
Information
- [X] The official example scripts
- [ ] My own modified scripts
🐛 Describe the bug
[Steps]
$ cd ./distributions/ollama/gpu
$ docker compose up
Error logs
Docker images pulled successfully, but startup failed with:
[+] Running 20/20
✔ llamastack 10 layers [⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿] 0B/0B Pulled 163.5s
✔ a480a496ba95 Already exists 0.0s
✔ 9555349e8380 Already exists 0.0s
✔ 1c161e44b06b Already exists 0.0s
✔ 417516d8bb61 Already exists 0.0s
✔ 7b3b2e7600c7 Pull complete 10.8s
✔ a2a73e3e4c11 Pull complete 20.7s
✔ d9db787ee8be Pull complete 139.4s
✔ 8f3db082d007 Pull complete 14.9s
✔ 01210563e7cf Pull complete 116.8s
✔ 39ae657a1b29 Pull complete 24.3s
✔ ollama 8 layers [⣿⣿⣿⣿⣿⣿⣿⣿] 0B/0B Pulled 319.1s
✔ 6414378b6477 Already exists 0.0s
✔ d84c10dcd047 Pull complete 28.1s
✔ 4a85dc2f00a0 Pull complete 36.1s
✔ 8df458f6a2c6 Pull complete 41.8s
✔ 8d8fd8dac143 Pull complete 47.0s
✔ 3eb0af8d9bf5 Pull complete 54.8s
✔ bfbabfde94f6 Pull complete 228.8s
✔ 746a9d594ec4 Pull complete 301.0s
[+] Running 5/2
✔ Volume "gpu_ollama" Created 0.0s
✔ Container gpu-ollama-1 Created 0.4s
! ollama Published ports are discarded when using host network mode 0.0s
✔ Container gpu-llamastack-1 Created 0.0s
! llamastack Published ports are discarded when using host network mode 0.0s
Attaching to gpu-llamastack-1, gpu-ollama-1
Error response from daemon: error gathering device information while adding custom device "nvidia.com/gpu=all": no such file or directory
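This error usually means Docker tried to resolve the GPU as a CDI device ("nvidia.com/gpu=all") but no CDI specification exists on the host. A possible fix, assuming the NVIDIA Container Toolkit is installed (untested on this particular setup), is to generate the spec:

```shell
# Generate a CDI specification for the installed NVIDIA GPUs
# (writes to the default static CDI location)
sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml

# List the device names now visible to CDI consumers,
# e.g. nvidia.com/gpu=0 ... nvidia.com/gpu=all
nvidia-ctk cdi list
```

After this, `docker compose up` should be able to resolve the `nvidia.com/gpu=all` device reference.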
Expected behavior
docker compose up successfully
Wondering if you are able to run the following command to start ollama?
$ docker run --gpus=all -v ollama:/root/.ollama -p 11434:11434 ollama/ollama
Then use the following command to start llama-stack server
$ cd llama-stack/distributions/ollama/gpu
$ docker run --network host -it -p 5000:5000 -v ~/.llama:/root/.llama -v ./run.yaml:/root/llamastack-run-ollama.yaml llamastack/distribution-ollama --yaml_config /root/llamastack-run-ollama.yaml
This works. The log is below:
...
Listening on ['::', '0.0.0.0']:5000
INFO: Started server process [1]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://['::', '0.0.0.0']:5000 (Press CTRL+C to quit)
But what about 'docker compose up'? I ran it and it still failed as before.
$ docker compose up
[+] Running 4/0
✔ Container gpu-ollama-1 Created 0.0s
! ollama Published ports are discarded when using host network mode 0.0s
✔ Container gpu-llamastack-1 Created 0.0s
! llamastack Published ports are discarded when using host network mode 0.0s
Attaching to llamastack-1, ollama-1
Gracefully stopping... (press Ctrl+C again to force)
Error response from daemon: error gathering device information while adding custom device "nvidia.com/gpu=all": no such file or directory
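Since the standalone `docker run --gpus=all` command works while compose fails on the CDI device lookup, the compose file presumably requests the GPU via a CDI device name under `devices:`. An alternative that avoids CDI entirely is the standard Compose GPU reservation syntax; a minimal sketch (service and volume names assumed from the logs above):

```yaml
services:
  ollama:
    image: ollama/ollama
    volumes:
      - ollama:/root/.ollama
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]

volumes:
  ollama:
```

This path goes through the `--gpus` machinery (the same one the working `docker run --gpus=all` command uses) rather than the CDI spec, so it should not depend on `/etc/cdi/nvidia.yaml` being present.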
This issue has been automatically marked as stale because it has not had activity within 60 days. It will be automatically closed if no further activity occurs within 30 days.
This issue has been automatically closed due to inactivity. Please feel free to reopen if you feel it is still relevant!