llama-cpp-python
CUDA not supported. `ValueError: Attempt to split tensors that exceed maximum supported devices. Current LLAMA_MAX_DEVICES=1`
This was a problem that I think was prematurely closed:
https://github.com/abetlen/llama-cpp-python/issues/1166
I'm currently trying to get a Llama 3.1 70B GGUF running on two 3090s, and no matter which installation method I use, I get the same error. Moreover, llama_cpp.llama_supports_gpu_offload() always reports False, even though it can use a single GPU.
Error:
# the env var is never even picked up (!?); it still reports 1 device
$ LLAMA_MAX_DEVICES=2 my_thing
ValueError: Attempt to split tensors that exceed maximum supported devices. Current LLAMA_MAX_DEVICES=1
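For context, the call that triggers this looks roughly like the following (the model path and split ratios are just placeholders for my setup):
from llama_cpp import Llama
# placeholder path; any tensor_split with two entries trips the device-count check
llm = Llama(
    model_path="/models/llama-3.1-70b-instruct.Q4_K_M.gguf",
    n_gpu_layers=-1,          # offload all layers
    tensor_split=[0.5, 0.5],  # split across the two 3090s
)
As soon as tensor_split has more entries than llama_cpp.llama_max_devices() returns, that ValueError is raised, so the real problem is the device count of 1 shown further down.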
Installation Methods:
# 1
pip install llama-cpp-python \
--extra-index-url https://abetlen.github.io/llama-cpp-python/whl/cu122
# 2
CMAKE_ARGS="-DGGML_CUDA=on" pip install llama-cpp-python
# 4
pip install llama-cpp-python --upgrade --force-reinstall --no-cache-dir
# 5
pip install llama-cpp-python --extra-index-url https://abetlen.github.io/llama-cpp-python/whl/cu122 --upgrade --force-reinstall --no-cache-dir
# 6 a downgrade was reported in this issue to work, but it does not
pip install llama-cpp-python==0.2.77 --extra-index-url https://abetlen.github.io/llama-cpp-python/whl/cu122 --upgrade --force-reinstall
# 7
pip install llama-cpp-python==0.2.76 --extra-index-url https://abetlen.github.io/llama-cpp-python/whl/cu122 --upgrade --force-reinstall
# 8
CMAKE_ARGS="-DGGML_CUDA=on" pip install llama-cpp-python==0.2.77 --upgrade --force-reinstall --no-cache-dir --verbose
# 9 build fails (see the note after these attempts)
git checkout "v0.2.77"
CMAKE_ARGS="-DGGML_CUDA=on" pip install -e . --upgrade --force-reinstall --no-cache-dir --verbose
CMake Error at CMakeLists.txt:25 (add_subdirectory):
The source directory
llama-cpp-python/vendor/llama.cpp
does not contain a CMakeLists.txt file.
# 10 the downgraded build fails the same way
git clone ...
CMAKE_ARGS="-DGGML_CUDA=on" pip install -e ../lib/llama-cpp-python/ --verbose
# 11 maybe we copy that CMakeLists in? nope.
$ cp CMakeLists.txt vendor/llama.cpp/
$ CMAKE_ARGS="-DGGML_CUDA=on" pip install -e . --upgrade --force-reinstall --no-cache-dir --verbose
CMake Error at vendor/llama.cpp/CMakeLists.txt:25 (add_subdirectory):
add_subdirectory given source "vendor/llama.cpp" which is not an existing
directory.
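For what it's worth, the missing vendor/llama.cpp CMakeLists.txt in attempts #9, #10, and #11 usually just means the llama.cpp git submodule was never checked out. Cloning with submodules (or initializing them afterwards) should at least let the source build configure, though I can't say yet whether the resulting build reports more than 1 device:
git clone --recurse-submodules https://github.com/abetlen/llama-cpp-python
cd llama-cpp-python
git submodule update --init --recursive   # populates vendor/llama.cpp
CMAKE_ARGS="-DGGML_CUDA=on" pip install -e . --upgrade --force-reinstall --no-cache-dir --verbose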
After each attempt (besides the source builds, which all fail):
$ python -c "import llama_cpp; print(llama_cpp.llama_max_devices())"
1
$ python -c "import llama_cpp; print(llama_cpp.llama_supports_gpu_offload())"
False
$ python3 -c "import torch; print(torch.cuda.device_count())"
2
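One more check that might help narrow this down (just a guess on my part): with --extra-index-url, pip falls back to the plain sdist on PyPI if none of the cu122 wheels match the local Python version/platform, which would explain a CPU-only install. Downloading without installing shows which file pip actually resolves:
$ pip download llama-cpp-python --extra-index-url https://abetlen.github.io/llama-cpp-python/whl/cu122 --no-deps -d /tmp/llama-wheels
$ ls /tmp/llama-wheels  # a .tar.gz here means the sdist was selected, not a CUDA wheel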
Originally posted by @freckletonj in https://github.com/abetlen/llama-cpp-python/issues/1166#issuecomment-2294990187
I was running into the same problem; I reported it here: https://github.com/abetlen/llama-cpp-python/issues/1693