llama-stack
Run ollama gpu distribution failed
System Info
NVIDIA GPU A30
nvidia-smi
Thu Oct 31 11:43:51 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 555.42.02 Driver Version: 555.42.02 CUDA Version: 12.5 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA A30 Off | 00000000:01:00.0 Off | 0 |
| N/A 40C P0 27W / 165W | 17MiB / 24576MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+------------------------+----------------------+
| 1 NVIDIA A30 Off | 00000000:21:00.0 Off | 0 |
| N/A 39C P0 27W / 165W | 17MiB / 24576MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+------------------------+----------------------+
| 2 NVIDIA A30 Off | 00000000:41:00.0 Off | 0 |
| N/A 37C P0 29W / 165W | 17MiB / 24576MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+------------------------+----------------------+
| 3 NVIDIA A30 Off | 00000000:61:00.0 Off | 0 |
| N/A 40C P0 31W / 165W | 17MiB / 24576MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+------------------------+----------------------+
Information
- [X] The official example scripts
- [ ] My own modified scripts
🐛 Describe the bug
[Steps]
$ cd ./distributions/ollama/gpu
$ docker compose up
Error logs
Docker images pulled successfully, but startup failed with:
[+] Running 20/20
✔ llamastack 10 layers [⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿] 0B/0B Pulled 163.5s
✔ a480a496ba95 Already exists 0.0s
✔ 9555349e8380 Already exists 0.0s
✔ 1c161e44b06b Already exists 0.0s
✔ 417516d8bb61 Already exists 0.0s
✔ 7b3b2e7600c7 Pull complete 10.8s
✔ a2a73e3e4c11 Pull complete 20.7s
✔ d9db787ee8be Pull complete 139.4s
✔ 8f3db082d007 Pull complete 14.9s
✔ 01210563e7cf Pull complete 116.8s
✔ 39ae657a1b29 Pull complete 24.3s
✔ ollama 8 layers [⣿⣿⣿⣿⣿⣿⣿⣿] 0B/0B Pulled 319.1s
✔ 6414378b6477 Already exists 0.0s
✔ d84c10dcd047 Pull complete 28.1s
✔ 4a85dc2f00a0 Pull complete 36.1s
✔ 8df458f6a2c6 Pull complete 41.8s
✔ 8d8fd8dac143 Pull complete 47.0s
✔ 3eb0af8d9bf5 Pull complete 54.8s
✔ bfbabfde94f6 Pull complete 228.8s
✔ 746a9d594ec4 Pull complete 301.0s
[+] Running 5/2
✔ Volume "gpu_ollama" Created 0.0s
✔ Container gpu-ollama-1 Created 0.4s
! ollama Published ports are discarded when using host network mode 0.0s
✔ Container gpu-llamastack-1 Created 0.0s
! llamastack Published ports are discarded when using host network mode 0.0s
Attaching to gpu-llamastack-1, gpu-ollama-1
Error response from daemon: error gathering device information while adding custom device "nvidia.com/gpu=all": no such file or directory
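This error usually means Docker tried to resolve the GPU as a CDI device ("nvidia.com/gpu=all") but no CDI specification exists on the host. A possible fix, assuming the NVIDIA Container Toolkit is installed (untested on this particular setup), is to generate the spec:

```shell
# Generate a CDI specification for the installed NVIDIA GPUs
# (writes to the default static CDI location)
sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml

# List the device names now visible to CDI consumers,
# e.g. nvidia.com/gpu=0 ... nvidia.com/gpu=all
nvidia-ctk cdi list
```

After this, `docker compose up` should be able to resolve the `nvidia.com/gpu=all` device reference.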
Expected behavior
docker compose up successfully
Wondering if you are able to run the following command to start ollama?
$ docker run --gpus=all -v ollama:/root/.ollama -p 11434:11434 ollama/ollama
Then use the following command to start llama-stack server
$ cd llama-stack/distributions/ollama/gpu
$ docker run --network host -it -p 5000:5000 -v ~/.llama:/root/.llama -v ./run.yaml:/root/llamastack-run-ollama.yaml llamastack/distribution-ollama --yaml_config /root/llamastack-run-ollama.yaml
This works. The log is below:
...
Listening on ['::', '0.0.0.0']:5000
INFO: Started server process [1]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://['::', '0.0.0.0']:5000 (Press CTRL+C to quit)
But what about 'docker compose up'? I ran it and it still failed as before.
$ docker compose up
[+] Running 4/0
✔ Container gpu-ollama-1 Created 0.0s
! ollama Published ports are discarded when using host network mode 0.0s
✔ Container gpu-llamastack-1 Created 0.0s
! llamastack Published ports are discarded when using host network mode 0.0s
Attaching to llamastack-1, ollama-1
Gracefully stopping... (press Ctrl+C again to force)
Error response from daemon: error gathering device information while adding custom device "nvidia.com/gpu=all": no such file or directory
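Since the standalone `docker run --gpus=all` command works while compose fails on the CDI device lookup, the compose file presumably requests the GPU via a CDI device name under `devices:`. An alternative that avoids CDI entirely is the standard Compose GPU reservation syntax; a minimal sketch (service and volume names assumed from the logs above):

```yaml
services:
  ollama:
    image: ollama/ollama
    volumes:
      - ollama:/root/.ollama
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]

volumes:
  ollama:
```

This path goes through the `--gpus` machinery (the same one the working `docker run --gpus=all` command uses) rather than the CDI spec, so it should not depend on `/etc/cdi/nvidia.yaml` being present.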
This issue has been automatically marked as stale because it has not had activity within 60 days. It will be automatically closed if no further activity occurs within 30 days.
This issue has been automatically closed due to inactivity. Please feel free to reopen if you feel it is still relevant!