llama-cpp-python
CUDA Build Failed, Please take a look at this
Alright... so I've used these commands to install the CUDA build of llama-cpp-python on my Windows 11 machine:

```
set CMAKE_ARGS="-DGGML_CUDA=on"
pip install llama-cpp-python --no-cache-dir
```
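A note on shell syntax, since the two Windows shells differ here: in cmd.exe, `set CMAKE_ARGS="-DGGML_CUDA=on"` stores the quotes as part of the value, while PowerShell uses `$env:` assignment. A minimal PowerShell sketch of the same two steps:

```powershell
# PowerShell equivalent of the commands above; the quotes here are shell
# syntax and do not end up inside the variable's value.
$env:CMAKE_ARGS = "-DGGML_CUDA=on"
pip install llama-cpp-python --no-cache-dir
```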
And well, once I run those commands, I get this output:

```
Collecting llama-cpp-python
Downloading llama_cpp_python-0.3.7.tar.gz (66.7 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 66.7/66.7 MB 28.4 MB/s eta 0:00:00
Installing build dependencies ... done
Getting requirements to build wheel ... done
Preparing metadata (pyproject.toml) ... done
Collecting typing-extensions>=4.5.0 (from llama-cpp-python)
Downloading typing_extensions-4.12.2-py3-none-any.whl.metadata (3.0 kB)
Collecting numpy>=1.20.0 (from llama-cpp-python)
Downloading numpy-2.2.2-cp311-cp311-win_amd64.whl.metadata (60 kB)
Collecting diskcache>=5.6.1 (from llama-cpp-python)
Downloading diskcache-5.6.3-py3-none-any.whl.metadata (20 kB)
Collecting jinja2>=2.11.3 (from llama-cpp-python)
Downloading jinja2-3.1.5-py3-none-any.whl.metadata (2.6 kB)
Collecting MarkupSafe>=2.0 (from jinja2>=2.11.3->llama-cpp-python)
Downloading MarkupSafe-3.0.2-cp311-cp311-win_amd64.whl.metadata (4.1 kB)
Downloading diskcache-5.6.3-py3-none-any.whl (45 kB)
Downloading jinja2-3.1.5-py3-none-any.whl (134 kB)
Downloading numpy-2.2.2-cp311-cp311-win_amd64.whl (12.9 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12.9/12.9 MB 27.9 MB/s eta 0:00:00
Downloading typing_extensions-4.12.2-py3-none-any.whl (37 kB)
Downloading MarkupSafe-3.0.2-cp311-cp311-win_amd64.whl (15 kB)
Building wheels for collected packages: llama-cpp-python
Building wheel for llama-cpp-python (pyproject.toml) ... error
error: subprocess-exited-with-error
× Building wheel for llama-cpp-python (pyproject.toml) did not run successfully.
│ exit code: 1
╰─> [31 lines of output]
*** scikit-build-core 0.10.7 using CMake 3.31.4 (wheel)
*** Configuring CMake...
2025-02-02 01:58:31,074 - scikit_build_core - WARNING - Can't find a Python library, got libdir=None, ldlibrary=None, multiarch=None, masd=None
loading initial cache file C:\Users\Aryan\AppData\Local\Temp\tmp93d8axo6\build\CMakeInit.txt
-- Building for: Visual Studio 17 2022
-- Selecting Windows SDK version 10.0.22621.0 to target Windows 10.0.22631.
-- The C compiler identification is MSVC 19.42.34436.0
-- The CXX compiler identification is MSVC 19.42.34436.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.42.34433/bin/Hostx64/x64/cl.exe - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.42.34433/bin/Hostx64/x64/cl.exe - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found Git: C:/Program Files/Git/cmd/git.exe (found version "2.43.0.windows.1")
CMake Error at vendor/llama.cpp/CMakeLists.txt:104 (message):
LLAMA_CUBLAS is deprecated and will be removed in the future.
Use GGML_CUDA instead
Call Stack (most recent call first):
vendor/llama.cpp/CMakeLists.txt:109 (llama_option_depr)
-- Configuring incomplete, errors occurred!
*** CMake configuration failed
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for llama-cpp-python
Failed to build llama-cpp-python
ERROR: Failed to build installable wheels for some pyproject.toml based projects (llama-cpp-python)
```
I have all the requirements installed, including the Visual Studio C++ desktop development workload and the CUDA Toolkit.
It's currently 2 AM for me and I have lost my mind. I need help with this.
I've successfully built a CUDA wheel, and your environment seems correct; I did not hit this error when I built this package. One thing stands out in your log, though: CMake complains about LLAMA_CUBLAS even though you set GGML_CUDA, which suggests a stale CMAKE_ARGS value from an earlier attempt may still be set in your shell (a quick check is sketched after the list below). Here are some suggestions that may be useful:
- Try to build a standalone CUDA binary of llama.cpp and verify it (see the first sketch below).
- `git clone` the llama-cpp-python repository with its submodules and try to build it (`pip install build wheel`, then `python -m build --wheel`); see the second sketch below.
- Install and test the wheel.
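A quick way to check for the stale-variable problem mentioned above, as a minimal PowerShell sketch (in cmd.exe the equivalents would be `echo %CMAKE_ARGS%` and an unquoted `set CMAKE_ARGS=-DGGML_CUDA=on`):

```powershell
# Show the current value; if it mentions LLAMA_CUBLAS, reset it before reinstalling.
$env:CMAKE_ARGS
$env:CMAKE_ARGS = "-DGGML_CUDA=on"
```

And a sketch of the suggested build-and-verify steps, assuming PowerShell and the upstream repository URLs (the exact wheel filename will vary):

```powershell
# 1. Build a standalone CUDA binary of llama.cpp to verify the CUDA toolchain.
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build -DGGML_CUDA=ON
cmake --build build --config Release
cd ..

# 2. Build the llama-cpp-python wheel from source (the submodules are required,
#    since llama.cpp is vendored as one).
git clone --recurse-submodules https://github.com/abetlen/llama-cpp-python.git
cd llama-cpp-python
pip install build wheel
$env:CMAKE_ARGS = "-DGGML_CUDA=on"
python -m build --wheel

# 3. Install the freshly built wheel and smoke-test the import.
pip install (Get-ChildItem .\dist\*.whl).FullName
python -c "from llama_cpp import Llama; print('import ok')"
```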
BTW, if someone needs to build a wheel for CUDA <= 12.4, they need to install MSVC 2019 and set the environment variable manually:

```powershell
$env:CMAKE_ARGS = "-DGGML_CUDA=ON -DCMAKE_GENERATOR_TOOLSET=v142,host=x64,version=14.29"
```
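Presumably this is needed because nvcc in older CUDA toolkits rejects the newest MSVC host compilers (the log above shows MSVC 19.42), so pinning the v142 (VS 2019) toolset keeps the host compiler within the supported range. After setting the variable, the install command itself is unchanged:

```powershell
# Same install as before; MSBuild now uses the pinned VS 2019 (v142) toolset.
pip install llama-cpp-python --no-cache-dir
```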