ggml_cuda_init:failed to initialize CUDA:initialization error
Help:
I have deployed Ollama in an offline environment running Ubuntu 18.04.3. When running the llama3:8b model, I found that only the CPU was being used, not the GPU. Checking the logs, I found this error message: `ggml_cuda_init: failed to initialize CUDA: initialization error`.
None of my attempts to fix it have worked. What should I do?
- PyTorch version: 1.4.0
- CUDA version: 10.0
- GPU: NVIDIA T4
Have you checked the minimum CUDA version and NVIDIA driver version required by the latest ggml? You can also check the ggml and llama.cpp repos for more help with this issue:
- https://github.com/ggerganov/ggml
- https://github.com/ggerganov/llama.cpp
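As a starting point for that version check, here is a minimal diagnostic sketch. It only uses standard tools (`nvidia-smi`, `nvcc`) and falls back to a message when a tool or device node is missing; the exact minimum versions required by current ggml/llama.cpp CUDA builds are not stated here, so check their repos for those numbers.

```shell
#!/bin/sh
# Report the driver version the CUDA runtime will see.
nvidia-smi --query-gpu=name,driver_version --format=csv 2>/dev/null \
  || echo "nvidia-smi unavailable: driver not installed or kernel module not loaded"

# Missing /dev/nvidia* device nodes (common in containers or after a driver
# update without reboot) is a frequent cause of 'initialization error'.
ls /dev/nvidia* 2>/dev/null \
  || echo "no /dev/nvidia* device nodes found"

# Report the installed CUDA toolkit version, if any.
nvcc --version 2>/dev/null | grep release \
  || echo "nvcc not on PATH: CUDA toolkit not installed or PATH not set"
```

If `nvidia-smi` works but Ollama still cannot initialize CUDA, a driver/runtime version mismatch (e.g. an old CUDA 10.0 install against a build expecting a newer runtime) is a likely suspect.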
@sushaofeng123 Did you end up resolving the problem? I have the exact same issue; I've tried almost everything and nothing seems to work!