ggml_cuda_init:failed to initialize CUDA:initialization error
Help:
I have deployed Ollama in an offline environment running Ubuntu 18.04.3. When running the llama3:8b model, I found that only the CPU was being used, not the GPU. Checking the logs, I found this error message: `ggml_cuda_init: failed to initialize CUDA: initialization error`.
None of my attempts to fix it have worked. What should I do?
- PyTorch version: 1.4.0
- CUDA version: 10.0
- GPU: NVIDIA T4
Have you checked the minimum CUDA version and NVIDIA driver version required by the latest ggml? You can also check the ggml and llama.cpp repos for more help with this issue:
- https://github.com/ggerganov/ggml
- https://github.com/ggerganov/llama.cpp
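As a starting point for that version check, here is a minimal diagnostic sketch. It only uses standard tools (`nvidia-smi`, `nvcc`) and falls back to a message when a tool or device node is missing; the exact minimum versions required by current ggml/llama.cpp CUDA builds are not stated here, so check their repos for those numbers.

```shell
#!/bin/sh
# Report the driver version the CUDA runtime will see.
nvidia-smi --query-gpu=name,driver_version --format=csv 2>/dev/null \
  || echo "nvidia-smi unavailable: driver not installed or kernel module not loaded"

# Missing /dev/nvidia* device nodes (common in containers or after a driver
# update without reboot) is a frequent cause of 'initialization error'.
ls /dev/nvidia* 2>/dev/null \
  || echo "no /dev/nvidia* device nodes found"

# Report the installed CUDA toolkit version, if any.
nvcc --version 2>/dev/null | grep release \
  || echo "nvcc not on PATH: CUDA toolkit not installed or PATH not set"
```

If `nvidia-smi` works but Ollama still cannot initialize CUDA, a driver/runtime version mismatch (e.g. an old CUDA 10.0 install against a build expecting a newer runtime) is a likely suspect.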
@sushaofeng123 Did you end up resolving the problem? I have the exact same issue; I've tried almost everything and nothing seems to work!