llama-cpp-python
CUDA Build Failed, Please take a look at this
Alright... so I've used these commands to install the CUDA build of llama-cpp-python on my Windows 11 machine:

```
set CMAKE_ARGS="-DGGML_CUDA=on"
pip install llama-cpp-python --no-cache-dir
```
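A note on shell syntax, since the two Windows shells differ here: in cmd.exe, `set CMAKE_ARGS="-DGGML_CUDA=on"` stores the quotes as part of the value, while PowerShell uses `$env:` assignment. A minimal PowerShell sketch of the same two steps:

```powershell
# PowerShell equivalent of the commands above; the quotes here are shell
# syntax and do not end up inside the variable's value.
$env:CMAKE_ARGS = "-DGGML_CUDA=on"
pip install llama-cpp-python --no-cache-dir
```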
And well, once I run those commands, I get this output:

```
Collecting llama-cpp-python
Downloading llama_cpp_python-0.3.7.tar.gz (66.7 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 66.7/66.7 MB 28.4 MB/s eta 0:00:00
Installing build dependencies ... done
Getting requirements to build wheel ... done
Preparing metadata (pyproject.toml) ... done
Collecting typing-extensions>=4.5.0 (from llama-cpp-python)
Downloading typing_extensions-4.12.2-py3-none-any.whl.metadata (3.0 kB)
Collecting numpy>=1.20.0 (from llama-cpp-python)
Downloading numpy-2.2.2-cp311-cp311-win_amd64.whl.metadata (60 kB)
Collecting diskcache>=5.6.1 (from llama-cpp-python)
Downloading diskcache-5.6.3-py3-none-any.whl.metadata (20 kB)
Collecting jinja2>=2.11.3 (from llama-cpp-python)
Downloading jinja2-3.1.5-py3-none-any.whl.metadata (2.6 kB)
Collecting MarkupSafe>=2.0 (from jinja2>=2.11.3->llama-cpp-python)
Downloading MarkupSafe-3.0.2-cp311-cp311-win_amd64.whl.metadata (4.1 kB)
Downloading diskcache-5.6.3-py3-none-any.whl (45 kB)
Downloading jinja2-3.1.5-py3-none-any.whl (134 kB)
Downloading numpy-2.2.2-cp311-cp311-win_amd64.whl (12.9 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12.9/12.9 MB 27.9 MB/s eta 0:00:00
Downloading typing_extensions-4.12.2-py3-none-any.whl (37 kB)
Downloading MarkupSafe-3.0.2-cp311-cp311-win_amd64.whl (15 kB)
Building wheels for collected packages: llama-cpp-python
Building wheel for llama-cpp-python (pyproject.toml) ... error
error: subprocess-exited-with-error
× Building wheel for llama-cpp-python (pyproject.toml) did not run successfully.
│ exit code: 1
╰─> [31 lines of output]
*** scikit-build-core 0.10.7 using CMake 3.31.4 (wheel)
*** Configuring CMake...
2025-02-02 01:58:31,074 - scikit_build_core - WARNING - Can't find a Python library, got libdir=None, ldlibrary=None, multiarch=None, masd=None
loading initial cache file C:\Users\Aryan\AppData\Local\Temp\tmp93d8axo6\build\CMakeInit.txt
-- Building for: Visual Studio 17 2022
-- Selecting Windows SDK version 10.0.22621.0 to target Windows 10.0.22631.
-- The C compiler identification is MSVC 19.42.34436.0
-- The CXX compiler identification is MSVC 19.42.34436.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.42.34433/bin/Hostx64/x64/cl.exe - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.42.34433/bin/Hostx64/x64/cl.exe - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found Git: C:/Program Files/Git/cmd/git.exe (found version "2.43.0.windows.1")
CMake Error at vendor/llama.cpp/CMakeLists.txt:104 (message):
LLAMA_CUBLAS is deprecated and will be removed in the future.
Use GGML_CUDA instead
Call Stack (most recent call first):
vendor/llama.cpp/CMakeLists.txt:109 (llama_option_depr)
-- Configuring incomplete, errors occurred!
*** CMake configuration failed
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for llama-cpp-python
Failed to build llama-cpp-python
ERROR: Failed to build installable wheels for some pyproject.toml based projects (llama-cpp-python)
```
I have all the requirements installed, including the Visual Studio C++ desktop development workload and the CUDA Toolkit.
It's currently 2 AM for me and I have lost my mind. I need help with this.
I've successfully built a CUDA wheel, and your environment seems correct; I did not hit this error when I built this package. One thing stands out in your log, though: CMake complains about LLAMA_CUBLAS even though you set GGML_CUDA, which suggests a stale CMAKE_ARGS value from an earlier attempt may still be set in your shell (a quick check is sketched after the list below). Here are some suggestions that may be useful:
- Try to build a standalone CUDA binary of llama.cpp and verify it (see the first sketch below).
- `git clone` the llama-cpp-python repository with its submodules and try to build it (`pip install build wheel`, then `python -m build --wheel`); see the second sketch below.
- Install and test the wheel.
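A quick way to check for the stale-variable problem mentioned above, as a minimal PowerShell sketch (in cmd.exe the equivalents would be `echo %CMAKE_ARGS%` and an unquoted `set CMAKE_ARGS=-DGGML_CUDA=on`):

```powershell
# Show the current value; if it mentions LLAMA_CUBLAS, reset it before reinstalling.
$env:CMAKE_ARGS
$env:CMAKE_ARGS = "-DGGML_CUDA=on"
```

And a sketch of the suggested build-and-verify steps, assuming PowerShell and the upstream repository URLs (the exact wheel filename will vary):

```powershell
# 1. Build a standalone CUDA binary of llama.cpp to verify the CUDA toolchain.
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build -DGGML_CUDA=ON
cmake --build build --config Release
cd ..

# 2. Build the llama-cpp-python wheel from source (the submodules are required,
#    since llama.cpp is vendored as one).
git clone --recurse-submodules https://github.com/abetlen/llama-cpp-python.git
cd llama-cpp-python
pip install build wheel
$env:CMAKE_ARGS = "-DGGML_CUDA=on"
python -m build --wheel

# 3. Install the freshly built wheel and smoke-test the import.
pip install (Get-ChildItem .\dist\*.whl).FullName
python -c "from llama_cpp import Llama; print('import ok')"
```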
BTW, if someone needs to build a wheel for CUDA <= 12.4, they need to install MSVC 2019 and set the environment variable manually:

```powershell
$env:CMAKE_ARGS = "-DGGML_CUDA=ON -DCMAKE_GENERATOR_TOOLSET=v142,host=x64,version=14.29"
```
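Presumably this is needed because nvcc in older CUDA toolkits rejects the newest MSVC host compilers (the log above shows MSVC 19.42), so pinning the v142 (VS 2019) toolset keeps the host compiler within the supported range. After setting the variable, the install command itself is unchanged:

```powershell
# Same install as before; MSBuild now uses the pinned VS 2019 (v142) toolset.
pip install llama-cpp-python --no-cache-dir
```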