gpt4all
CUDA 12.0 error while trying to run on CPU
I tried to run on the CPU but got a CUDA error.
Bug Report
import gpt4all

llma_8b = gpt4all.GPT4All(model_name="Meta-Llama-3-8B-Instruct.Q4_0.gguf",
                          model_path="/repository/models/mohammad/llm_models/rag/",
                          device="cpu",
                          allow_download=True)
Running this on Linux produces the following error:
OSError: /home/mohammad/.virtualenv/rag_env/lib/python3.8/site-packages/nvidia/cuda_runtime/lib/libcudart.so.12: cannot open shared object file: No such file or directory
It seems _pyllmodel.py has this block of code, which tries to load the CUDA 12 library files. Although I have CUDA 11.8 on the system, I want to run on the CPU because of insufficient memory.
if platform.system() in ('Linux', 'Windows'):
    try:
        from nvidia import cuda_runtime, cublas
    except ImportError:
        pass  # CUDA is optional
    else:
        if platform.system() == 'Linux':
            cudalib = 'lib/libcudart.so.12'
            cublaslib = 'lib/libcublas.so.12'
        else:  # Windows
            cudalib = r'bin\cudart64_12.dll'
            cublaslib = r'bin\cublas64_12.dll'
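The snippet only skips CUDA when the nvidia package cannot be imported. One possible explanation for the error is that a pip-installed nvidia.cuda_runtime package is present but ships a different libcudart version than the hard-coded .so.12 name. The following is a small diagnostic sketch for checking which libcudart files the pip package actually provides on Linux. The function name find_pip_cuda_libs and its defaults are hypothetical and not part of gpt4all.

```python
import importlib.util
from pathlib import Path


def find_pip_cuda_libs(package="nvidia.cuda_runtime", pattern="libcudart.so*"):
    """List library file names matching `pattern` in <package>/lib.

    Hypothetical helper for diagnosis only; it is not part of gpt4all.
    Returns an empty list if the package or its lib directory is missing.
    """
    try:
        spec = importlib.util.find_spec(package)
    except ModuleNotFoundError:
        # The parent package (e.g. `nvidia`) is not installed at all.
        return []
    if spec is None or not spec.submodule_search_locations:
        return []

    found = []
    for location in spec.submodule_search_locations:
        lib_dir = Path(location) / "lib"
        if lib_dir.is_dir():
            found.extend(sorted(p.name for p in lib_dir.glob(pattern)))
    return found


if __name__ == "__main__":
    # An empty list, or a list without libcudart.so.12, would be consistent
    # with the OSError shown above.
    print(find_pip_cuda_libs())
```

If the output contains no libcudart.so.12, the import of nvidia.cuda_runtime can still succeed while loading the CUDA 12 file fails.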
Your Environment
- GPT4All version: 2.7.0
- Operating System: Linux
- Chat model used (if applicable): Llama 3 8B Instruct
I haven't tried to reproduce this, but there should be a new release of the Python bindings soon. In fact, it seems like it's only being held up by an issue with CI. Oh and also some PyPI limitations.
Can you try it again once that is available?
Also see PR #2802 and note specifically:
- Also search for CUDA 11 installed with pip at runtime since we now build against CUDA 11.8 anyway
For now it should be sufficient to pip install nvidia-cublas-cu12 nvidia-cuda-runtime-cu12 as long as your GPU driver is somewhat recent (at least 525.60.13 if we provide binary support for your GPU architecture, or 555.58 if we don't).
Support for CUDA 11 will be available in the next Python release (possibly 2.8.1).
The issue isn't whether the user has installed CUDA 12 drivers for their GPU. It's that when gpt4all.GPT4All() is called with device="cpu", it unnecessarily checks for CUDA drivers.
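To illustrate the behavior being asked for, here is a minimal sketch of a guarded preload. It assumes the libraries are loaded with ctypes.CDLL, which the OSError in the report is consistent with but which the quoted snippet does not show. The function preload_cuda_libs and its device parameter are hypothetical and do not reflect how the gpt4all bindings are actually structured.

```python
import ctypes
import os
import platform


def preload_cuda_libs(device):
    """Try to preload the pip-installed CUDA 12 libraries.

    Hypothetical sketch, not gpt4all code. Returns True if both libraries
    were loaded, False if the step was skipped or the libraries are missing.
    """
    if device == "cpu":
        return False  # CPU-only use never needs the CUDA libraries
    if platform.system() not in ("Linux", "Windows"):
        return False
    try:
        from nvidia import cuda_runtime, cublas
    except ImportError:
        return False  # CUDA is optional

    if platform.system() == "Linux":
        cudalib, cublaslib = "lib/libcudart.so.12", "lib/libcublas.so.12"
    else:  # Windows
        cudalib, cublaslib = r"bin\cudart64_12.dll", r"bin\cublas64_12.dll"

    try:
        # Assumption: loading is done via ctypes.CDLL with RTLD_GLOBAL.
        ctypes.CDLL(os.path.join(list(cuda_runtime.__path__)[0], cudalib),
                    mode=ctypes.RTLD_GLOBAL)
        ctypes.CDLL(os.path.join(list(cublas.__path__)[0], cublaslib),
                    mode=ctypes.RTLD_GLOBAL)
    except OSError:
        return False  # package present, but the CUDA 12 files are not
    return True
```

With a guard like this, preload_cuda_libs("cpu") returns False without touching any CUDA files, and a missing libcudart.so.12 degrades to a skipped preload instead of an exception.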