Prerequisites

Please answer the following questions for yourself before submitting an issue.

[x] I am running the latest code. Development is very rapid so there are no tagged versions as of now.
[x] I carefully followed the README.md.
[x] I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
[x] I reviewed the Discussions, and have a new bug or useful enhancement to share.

Expected Behavior

I expected llama-cpp-python to build with cuBLAS support.

Current Behavior

CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python --force-reinstall --upgrade --no-cache-dir

The command above fails during CMake configuration, at this point:

      -- Performing Test CMAKE_HAVE_LIBC_PTHREAD
      -- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
      -- Found Threads: TRUE
      -- Found CUDAToolkit: /usr/local/cuda/include (found version "12.2.128")
      -- cuBLAS found
      -- The CUDA compiler identification is unknown
      CMake Error at /tmp/pip-build-env-vyy1n26b/overlay/local/lib/python3.10/dist-packages/cmake/data/share/cmake-3.27/Modules/CMakeDetermineCUDACompiler.cmake:603 (message):
        Failed to detect a default CUDA architecture.
      
      
      
        Compiler output:
      
      Call Stack (most recent call first):
        vendor/llama.cpp/CMakeLists.txt:250 (enable_language)
      
      
      -- Configuring incomplete, errors occurred!
      Traceback (most recent call last):
        File "/tmp/pip-build-env-vyy1n26b/overlay/local/lib/python3.10/dist-packages/skbuild/setuptools_wrap.py", line 666, in setup
          env = cmkr.configure(
        File "/tmp/pip-build-env-vyy1n26b/overlay/local/lib/python3.10/dist-packages/skbuild/cmaker.py", line 357, in configure
          raise SKBuildError(msg)
      
      An error occurred while configuring with CMake.
        Command:
          /tmp/pip-build-env-vyy1n26b/overlay/local/lib/python3.10/dist-packages/cmake/data/bin/cmake /tmp/pip-install-07gczwgt/llama-cpp-python_dac2049bbf404ad88046ca7ba38e3fdb -G Ninja -DCMAKE_MAKE_PROGRAM:FILEPATH=/tmp/pip-build-env-vyy1n26b/overlay/local/lib/python3.10/dist-packages/ninja/data/bin/ninja --no-warn-unused-cli -DCMAKE_INSTALL_PREFIX:PATH=/tmp/pip-install-07gczwgt/llama-cpp-python_dac2049bbf404ad88046ca7ba38e3fdb/_skbuild/linux-x86_64-3.10/cmake-install -DPYTHON_VERSION_STRING:STRING=3.10.12 -DSKBUILD:INTERNAL=TRUE -DCMAKE_MODULE_PATH:PATH=/tmp/pip-build-env-vyy1n26b/overlay/local/lib/python3.10/dist-packages/skbuild/resources/cmake -DPYTHON_EXECUTABLE:PATH=/usr/bin/python3 -DPYTHON_INCLUDE_DIR:PATH=/usr/include/python3.10 -DPYTHON_LIBRARY:PATH=/usr/lib/x86_64-linux-gnu/libpython3.10.so -DPython_EXECUTABLE:PATH=/usr/bin/python3 -DPython_ROOT_DIR:PATH=/usr -DPython_FIND_REGISTRY:STRING=NEVER -DPython_INCLUDE_DIR:PATH=/usr/include/python3.10 -DPython_NumPy_INCLUDE_DIRS:PATH=/usr/lib/python3/dist-packages/numpy/core/include -DPython3_EXECUTABLE:PATH=/usr/bin/python3 -DPython3_ROOT_DIR:PATH=/usr -DPython3_FIND_REGISTRY:STRING=NEVER -DPython3_INCLUDE_DIR:PATH=/usr/include/python3.10 -DPython3_NumPy_INCLUDE_DIRS:PATH=/usr/lib/python3/dist-packages/numpy/core/include -DCMAKE_MAKE_PROGRAM:FILEPATH=/tmp/pip-build-env-vyy1n26b/overlay/local/lib/python3.10/dist-packages/ninja/data/bin/ninja -DLLAMA_CUBLAS=on -DCMAKE_BUILD_TYPE:STRING=Release -DLLAMA_CUBLAS=on
        Source directory:
          /tmp/pip-install-07gczwgt/llama-cpp-python_dac2049bbf404ad88046ca7ba38e3fdb
        Working directory:
          /tmp/pip-install-07gczwgt/llama-cpp-python_dac2049bbf404ad88046ca7ba38e3fdb/_skbuild/linux-x86_64-3.10/cmake-build
      Please see CMake's output for more information.
      
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for llama-cpp-python
Failed to build llama-cpp-python
ERROR: Could not build wheels for llama-cpp-python, which is required to install pyproject.toml-based projects
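
The line "The CUDA compiler identification is unknown" in the log suggests that CMake found the CUDA toolkit but could not locate or run nvcc. A minimal sketch for checking that, assuming a POSIX shell and the usual /usr/local/cuda* install layout (the helper name find_nvcc is hypothetical and not part of any tool mentioned here):

```shell
# Hypothetical helper: print the first nvcc found, checking PATH first and
# then common install prefixes under /usr/local (an assumed layout).
find_nvcc() {
    if command -v nvcc >/dev/null 2>&1; then
        command -v nvcc
        return 0
    fi
    for candidate in /usr/local/cuda/bin/nvcc /usr/local/cuda-*/bin/nvcc; do
        # An unmatched glob stays literal, so the -x test simply fails.
        if [ -x "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

if nvcc_path="$(find_nvcc)"; then
    echo "nvcc found at: $nvcc_path"
else
    echo "nvcc not found on PATH or under /usr/local/cuda*"
fi
```

If nvcc only turns up under /usr/local and not on PATH, that would be consistent with CMake failing to identify the CUDA compiler.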

Environment and Context

Ubuntu 22.04, Intel CPU, 64 GB RAM, and a 3060 GPU with the latest NVIDIA drivers (535.86.10) and CUDA (12.2) installed via apt.

$ lscpu
Architecture:            x86_64
  CPU op-mode(s):        32-bit, 64-bit
  Address sizes:         39 bits physical, 48 bits virtual
  Byte Order:            Little Endian
CPU(s):                  12
  On-line CPU(s) list:   0-11
Vendor ID:               GenuineIntel
  Model name:            11th Gen Intel(R) Core(TM) i5-11600K @ 3.90GHz
    CPU family:          6
    Model:               167
    Thread(s) per core:  2
    Core(s) per socket:  6
    Socket(s):           1
    Stepping:            1
    CPU max MHz:         4900,0000
    CPU min MHz:         800,0000
    BogoMIPS:            7824.00
    Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx
                         pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni
                         pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer
                         aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault invpcid_single ssbd ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi
                         flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid mpx avx512f avx512dq rdseed adx smap avx512ifma
                         clflushopt intel_pt avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify
                         hwp_act_window hwp_epp hwp_pkg_req avx512vbmi umip pku ospke avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg avx512_vpopcntdq rdpid
                         fsrm md_clear flush_l1d arch_capabilities
Virtualization features: 
  Virtualization:        VT-x
Caches (sum of all):     
  L1d:                   288 KiB (6 instances)
  L1i:                   192 KiB (6 instances)
  L2:                    3 MiB (6 instances)
  L3:                    12 MiB (1 instance)
NUMA:                    
  NUMA node(s):          1
  NUMA node0 CPU(s):     0-11
Vulnerabilities:         
  Itlb multihit:         Not affected
  L1tf:                  Not affected
  Mds:                   Not affected
  Meltdown:              Not affected
  Mmio stale data:       Vulnerable: Clear CPU buffers attempted, no microcode; SMT vulnerable
  Retbleed:              Mitigation; Enhanced IBRS
  Spec store bypass:     Mitigation; Speculative Store Bypass disabled via prctl
  Spectre v1:            Mitigation; usercopy/swapgs barriers and __user pointer sanitization
  Spectre v2:            Mitigation; Enhanced IBRS, IBPB conditional, RSB filling, PBRSB-eIBRS SW sequence
  Srbds:                 Not affected
  Tsx async abort:       Not affected

Linux aquarelle 6.2.0-26-generic #26~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Thu Jul 13 16:27:29 UTC 2 x86_64 x86_64 x86_64 GNU/Linux

$ python3 --version
Python 3.10.12
$ make --version
GNU Make 4.3
Built for x86_64-pc-linux-gnu
Copyright (C) 1988-2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
$ g++ --version
g++ (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
Copyright (C) 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Steps to Reproduce

step 1

CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python --force-reinstall --upgrade --no-cache-dir

Aug 22 '23 16:08 arthurwolf

My recommendation would be to try CUDA 11.8. I have had problems with other installations using CUDA 12 in the past when working with LLMs, but no issues with CUDA 11.7 and 11.8. I currently have everything installed with CUDA 11.8, also on Ubuntu, using Python 3.10.

Aug 23 '23 12:08 Fi-711

@arthurwolf You can try building using the following, it worked for me.

CUDACXX=/usr/local/cuda-12/bin/nvcc CMAKE_ARGS="-DLLAMA_CUBLAS=on -DCMAKE_CUDA_ARCHITECTURES=native" FORCE_CMAKE=1 pip install llama-cpp-python --no-cache-dir --force-reinstall --upgrade
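
The nvcc path in that command depends on where CUDA is installed, so it may need adjusting per machine. As a rough sketch, the same command can be generated for any nvcc path and reviewed before running it (the function name llama_cuda_install_cmd is hypothetical, and the path passed in the example is just the one from the command above):

```shell
# Hypothetical helper: print the install command from this thread with
# CUDACXX pointing at the nvcc path given as the first argument.
llama_cuda_install_cmd() {
    nvcc_path="$1"
    printf 'CUDACXX=%s CMAKE_ARGS="-DLLAMA_CUBLAS=on -DCMAKE_CUDA_ARCHITECTURES=native" FORCE_CMAKE=1 pip install llama-cpp-python --no-cache-dir --force-reinstall --upgrade\n' "$nvcc_path"
}

# Dry run: only prints the command, nothing is installed.
llama_cuda_install_cmd /usr/local/cuda-12/bin/nvcc
```

Printing first makes it easy to confirm that CUDACXX points at an nvcc that actually exists before starting a full rebuild.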

Sep 17 '23 15:09 m-from-space

Thanks, it finally worked for me in WSL2.

Oct 02 '23 17:10 avatsaev

Thanks, it worked

Nov 03 '23 15:11 BennisonDevadoss

This worked for me with small corrections. My CUDA version is 12.2, and I use the Poetry environment manager.

CUDACXX=/usr/local/cuda-12.0/bin/nvcc CMAKE_ARGS="-DLLAMA_CUBLAS=on -DCMAKE_CUDA_ARCHITECTURES=native" FORCE_CMAKE=1 poetry run pip install llama-cpp-python --no-cache-dir --force-reinstall --upgrade

Mar 05 '24 09:03 Anna-Pinewood

This works for me too! Thank you very much!

Apr 22 '24 16:04 hyusterr

CUDACXX=/usr/local/cuda-12/bin/nvcc CMAKE_ARGS="-DLLAMA_CUBLAS=on -DCMAKE_CUDA_ARCHITECTURES=native" FORCE_CMAKE=1 pip install llama-cpp-python --no-cache-dir --force-reinstall --upgrade

Thanks, it works on Ubuntu 22.04.

May 02 '24 05:05 JimmyJIA-02

@arthurwolf

I have CUDA 12.0 on Ubuntu 22.04 and it works perfectly with your patch. Thanks.

May 05 '24 00:05 GabriIT

Thanks, the CUDACXX environment variable was all I needed.
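
On the architecture flag: `native` asks CMake to detect the GPU in the build machine, which to the best of my knowledge needs CMake 3.24 or newer. Where that is not available, an explicit value can be passed instead. A minimal sketch, assuming an nvidia-smi recent enough to support the compute_cap query (the helper name cap_to_arch is hypothetical):

```shell
# Hypothetical helper: turn a compute capability such as "8.6" into the
# form used by CMAKE_CUDA_ARCHITECTURES ("86") by dropping the dot.
cap_to_arch() {
    printf '%s\n' "$1" | tr -d '.'
}

# Assumption: this nvidia-smi supports --query-gpu=compute_cap.
if command -v nvidia-smi >/dev/null 2>&1; then
    cap="$(nvidia-smi --query-gpu=compute_cap --format=csv,noheader | head -n 1)"
    echo "-DCMAKE_CUDA_ARCHITECTURES=$(cap_to_arch "$cap")"
else
    echo "nvidia-smi not found; pass the architecture by hand, e.g. $(cap_to_arch 8.6) for compute capability 8.6"
fi
```

The printed flag would replace `-DCMAKE_CUDA_ARCHITECTURES=native` inside CMAKE_ARGS. For the 3060 in the original report the value should come out as 86, assuming it is an RTX 3060 with compute capability 8.6.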

May 22 '24 21:05 AmericanPresidentJimmyCarter

Failed to detect a default CUDA architecture