
Failed building wheel for llama-cpp-python

Open pklochowicz opened this issue 9 months ago • 6 comments

Hello, I cannot install llama-cpp-python on Ubuntu 24.04.2 with CUDA 12.4 and Python 3.12. The error output is copied below (I've removed the parts that succeeded):

(llms) patryk@patryk-MS-7E16:~/llama-cpp-python$ pip install llama-cpp-python --extra-index-url https://abetlen.github.io/llama-cpp-python/whl/cu124
Looking in indexes: https://pypi.org/simple, https://abetlen.github.io/llama-cpp-python/whl/cu124
Collecting llama-cpp-python
  Using cached llama_cpp_python-0.3.7.tar.gz (66.7 MB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Installing backend dependencies ... done
  Preparing metadata (pyproject.toml) ... done
Building wheels for collected packages: llama-cpp-python
  Building wheel for llama-cpp-python (pyproject.toml) ... error
  error: subprocess-exited-with-error

× Building wheel for llama-cpp-python (pyproject.toml) did not run successfully.
│ exit code: 1
╰─> [154 lines of output]
  *** scikit-build-core 0.10.7 using CMake 3.28.3 (wheel)
  *** Configuring CMake...
  loading initial cache file /tmp/tmpmijf0b_7/build/CMakeInit.txt
  -- The C compiler identification is GNU 13.3.0
  -- The CXX compiler identification is GNU 13.3.0
  -- Detecting C compiler ABI info
  -- Detecting C compiler ABI info - done
  -- Check for working C compiler: /usr/bin/gcc - skipped
  -- Detecting C compile features
  -- Detecting C compile features - done
  -- Detecting CXX compiler ABI info
  -- Detecting CXX compiler ABI info - done
  -- Check for working CXX compiler: /usr/bin/g++ - skipped
  -- Detecting CXX compile features
  -- Detecting CXX compile features - done
  -- Found Git: /usr/bin/git (found version "2.43.0")
  -- Performing Test CMAKE_HAVE_LIBC_PTHREAD
  -- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
  -- Found Threads: TRUE
  -- Warning: ccache not found - consider installing it for faster compilation or disable this warning with GGML_CCACHE=OFF
  -- CMAKE_SYSTEM_PROCESSOR: x86_64
  -- Including CPU backend
  -- Found OpenMP_C: -fopenmp (found version "4.5")
  -- Found OpenMP_CXX: -fopenmp (found version "4.5")
  -- Found OpenMP: TRUE (found version "4.5")
  -- x86 detected
  -- Adding CPU backend variant ggml-cpu: -march=native
  CMake Warning at vendor/llama.cpp/ggml/CMakeLists.txt:285 (message):
    GGML build version fixed at 1 likely due to a shallow clone.

  CMake Warning (dev) at CMakeLists.txt:13 (install):
    Target llama has PUBLIC_HEADER files but no PUBLIC_HEADER DESTINATION.
  Call Stack (most recent call first):
    CMakeLists.txt:97 (llama_cpp_python_install_target)
  This warning is for project developers.  Use -Wno-dev to suppress it.
  
  CMake Warning (dev) at CMakeLists.txt:21 (install):
    Target llama has PUBLIC_HEADER files but no PUBLIC_HEADER DESTINATION.
  Call Stack (most recent call first):
    CMakeLists.txt:97 (llama_cpp_python_install_target)
  This warning is for project developers.  Use -Wno-dev to suppress it.
  
  CMake Warning (dev) at CMakeLists.txt:13 (install):
    Target ggml has PUBLIC_HEADER files but no PUBLIC_HEADER DESTINATION.
  Call Stack (most recent call first):
    CMakeLists.txt:98 (llama_cpp_python_install_target)
  This warning is for project developers.  Use -Wno-dev to suppress it.
  
  CMake Warning (dev) at CMakeLists.txt:21 (install):
    Target ggml has PUBLIC_HEADER files but no PUBLIC_HEADER DESTINATION.
  Call Stack (most recent call first):
    CMakeLists.txt:98 (llama_cpp_python_install_target)
  This warning is for project developers.  Use -Wno-dev to suppress it.
  
  -- Configuring done (0.4s)
  -- Generating done (0.0s)
  -- Build files have been written to: /tmp/tmpmijf0b_7/build
  *** Building project with Ninja...
  Change Dir: '/tmp/tmpmijf0b_7/build'
  
  Run Build Command(s): ninja -v
  [58/60] : && /usr/bin/g++  -pthread -B /home/patryk/miniconda3/envs/llms/compiler_compat -O3 -DNDEBUG  vendor/llama.cpp/examples/llava/CMakeFiles/llava.dir/llava.cpp.o vendor/llama.cpp/examples/llava/CMakeFiles/llava.dir/clip.cpp.o vendor/llama.cpp/examples/llava/CMakeFiles/llama-llava-cli.dir/llava-cli.cpp.o -o vendor/llama.cpp/examples/llava/llama-llava-cli  -Wl,-rpath,/tmp/tmpmijf0b_7/build/bin:  vendor/llama.cpp/common/libcommon.a  bin/libllama.so  bin/libggml.so  bin/libggml-cpu.so  bin/libggml-base.so && :
  FAILED: vendor/llama.cpp/examples/llava/llama-llava-cli
  : && /usr/bin/g++  -pthread -B /home/patryk/miniconda3/envs/llms/compiler_compat -O3 -DNDEBUG  vendor/llama.cpp/examples/llava/CMakeFiles/llava.dir/llava.cpp.o vendor/llama.cpp/examples/llava/CMakeFiles/llava.dir/clip.cpp.o vendor/llama.cpp/examples/llava/CMakeFiles/llama-llava-cli.dir/llava-cli.cpp.o -o vendor/llama.cpp/examples/llava/llama-llava-cli  -Wl,-rpath,/tmp/tmpmijf0b_7/build/bin:  vendor/llama.cpp/common/libcommon.a  bin/libllama.so  bin/libggml.so  bin/libggml-cpu.so  bin/libggml-base.so && :
  /home/patryk/miniconda3/envs/llms/compiler_compat/ld: warning: libgomp.so.1, needed by bin/libggml-cpu.so, not found (try using -rpath or -rpath-link)
  /home/patryk/miniconda3/envs/llms/compiler_compat/ld: bin/libggml-cpu.so: undefined reference to `GOMP_barrier@GOMP_1.0'
  /home/patryk/miniconda3/envs/llms/compiler_compat/ld: bin/libggml-cpu.so: undefined reference to `GOMP_parallel@GOMP_4.0'
  /home/patryk/miniconda3/envs/llms/compiler_compat/ld: bin/libggml-cpu.so: undefined reference to `omp_get_thread_num@OMP_1.0'
  /home/patryk/miniconda3/envs/llms/compiler_compat/ld: bin/libggml-cpu.so: undefined reference to `GOMP_single_start@GOMP_1.0'
  /home/patryk/miniconda3/envs/llms/compiler_compat/ld: bin/libggml-cpu.so: undefined reference to `omp_get_num_threads@OMP_1.0'
  collect2: error: ld returned 1 exit status
  [59/60] : && /usr/bin/g++  -pthread -B /home/patryk/miniconda3/envs/llms/compiler_compat -O3 -DNDEBUG  vendor/llama.cpp/examples/llava/CMakeFiles/llava.dir/llava.cpp.o vendor/llama.cpp/examples/llava/CMakeFiles/llava.dir/clip.cpp.o vendor/llama.cpp/examples/llava/CMakeFiles/llama-qwen2vl-cli.dir/qwen2vl-cli.cpp.o -o vendor/llama.cpp/examples/llava/llama-qwen2vl-cli  -Wl,-rpath,/tmp/tmpmijf0b_7/build/bin:  vendor/llama.cpp/common/libcommon.a  bin/libllama.so  bin/libggml.so  bin/libggml-cpu.so  bin/libggml-base.so && :
  FAILED: vendor/llama.cpp/examples/llava/llama-qwen2vl-cli
  : && /usr/bin/g++  -pthread -B /home/patryk/miniconda3/envs/llms/compiler_compat -O3 -DNDEBUG  vendor/llama.cpp/examples/llava/CMakeFiles/llava.dir/llava.cpp.o vendor/llama.cpp/examples/llava/CMakeFiles/llava.dir/clip.cpp.o vendor/llama.cpp/examples/llava/CMakeFiles/llama-qwen2vl-cli.dir/qwen2vl-cli.cpp.o -o vendor/llama.cpp/examples/llava/llama-qwen2vl-cli  -Wl,-rpath,/tmp/tmpmijf0b_7/build/bin:  vendor/llama.cpp/common/libcommon.a  bin/libllama.so  bin/libggml.so  bin/libggml-cpu.so  bin/libggml-base.so && :
  /home/patryk/miniconda3/envs/llms/compiler_compat/ld: warning: libgomp.so.1, needed by bin/libggml-cpu.so, not found (try using -rpath or -rpath-link)
  /home/patryk/miniconda3/envs/llms/compiler_compat/ld: bin/libggml-cpu.so: undefined reference to `GOMP_barrier@GOMP_1.0'
  /home/patryk/miniconda3/envs/llms/compiler_compat/ld: bin/libggml-cpu.so: undefined reference to `GOMP_parallel@GOMP_4.0'
  /home/patryk/miniconda3/envs/llms/compiler_compat/ld: bin/libggml-cpu.so: undefined reference to `omp_get_thread_num@OMP_1.0'
  /home/patryk/miniconda3/envs/llms/compiler_compat/ld: bin/libggml-cpu.so: undefined reference to `GOMP_single_start@GOMP_1.0'
  /home/patryk/miniconda3/envs/llms/compiler_compat/ld: bin/libggml-cpu.so: undefined reference to `omp_get_num_threads@OMP_1.0'
  collect2: error: ld returned 1 exit status
  [60/60] : && /usr/bin/g++  -pthread -B /home/patryk/miniconda3/envs/llms/compiler_compat -O3 -DNDEBUG  vendor/llama.cpp/examples/llava/CMakeFiles/llava.dir/llava.cpp.o vendor/llama.cpp/examples/llava/CMakeFiles/llava.dir/clip.cpp.o vendor/llama.cpp/examples/llava/CMakeFiles/llama-minicpmv-cli.dir/minicpmv-cli.cpp.o -o vendor/llama.cpp/examples/llava/llama-minicpmv-cli  -Wl,-rpath,/tmp/tmpmijf0b_7/build/bin:  vendor/llama.cpp/common/libcommon.a  bin/libllama.so  bin/libggml.so  bin/libggml-cpu.so  bin/libggml-base.so && :
  FAILED: vendor/llama.cpp/examples/llava/llama-minicpmv-cli
  : && /usr/bin/g++  -pthread -B /home/patryk/miniconda3/envs/llms/compiler_compat -O3 -DNDEBUG  vendor/llama.cpp/examples/llava/CMakeFiles/llava.dir/llava.cpp.o vendor/llama.cpp/examples/llava/CMakeFiles/llava.dir/clip.cpp.o vendor/llama.cpp/examples/llava/CMakeFiles/llama-minicpmv-cli.dir/minicpmv-cli.cpp.o -o vendor/llama.cpp/examples/llava/llama-minicpmv-cli  -Wl,-rpath,/tmp/tmpmijf0b_7/build/bin:  vendor/llama.cpp/common/libcommon.a  bin/libllama.so  bin/libggml.so  bin/libggml-cpu.so  bin/libggml-base.so && :
  /home/patryk/miniconda3/envs/llms/compiler_compat/ld: warning: libgomp.so.1, needed by bin/libggml-cpu.so, not found (try using -rpath or -rpath-link)
  /home/patryk/miniconda3/envs/llms/compiler_compat/ld: bin/libggml-cpu.so: undefined reference to `GOMP_barrier@GOMP_1.0'
  /home/patryk/miniconda3/envs/llms/compiler_compat/ld: bin/libggml-cpu.so: undefined reference to `GOMP_parallel@GOMP_4.0'
  /home/patryk/miniconda3/envs/llms/compiler_compat/ld: bin/libggml-cpu.so: undefined reference to `omp_get_thread_num@OMP_1.0'
  /home/patryk/miniconda3/envs/llms/compiler_compat/ld: bin/libggml-cpu.so: undefined reference to `GOMP_single_start@GOMP_1.0'
  /home/patryk/miniconda3/envs/llms/compiler_compat/ld: bin/libggml-cpu.so: undefined reference to `omp_get_num_threads@OMP_1.0'
  collect2: error: ld returned 1 exit status
  ninja: build stopped: subcommand failed.
  
  
  *** CMake build failed
  [end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for llama-cpp-python
Failed to build llama-cpp-python
ERROR: Failed to build installable wheels for some pyproject.toml based projects (llama-cpp-python)

pklochowicz avatar Feb 12 '25 11:02 pklochowicz
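
For context before the replies: the link failures above come from conda's compiler_compat/ld, which does not search the system library directories where Ubuntu installs libgomp.so.1 (the GNU OpenMP runtime). A quick way to check both sides of that mismatch (paths are typical for Ubuntu x86_64 and are an assumption, adjust as needed):

# Where does the system keep libgomp? (usually /usr/lib/x86_64-linux-gnu on Ubuntu)
find /usr -name 'libgomp.so.1' 2>/dev/null
# Does the active conda env ship its own copy that the compat linker could see?
find "$CONDA_PREFIX" -name 'libgomp.so*' 2>/dev/null
# What does the dynamic loader itself know about?
ldconfig -p | grep libgomp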

Seeing the same here for CUDA 12.3.

cesarandreslopez avatar Feb 12 '25 16:02 cesarandreslopez

@pklochowicz in case it's useful for you, this will work with CUDA support:

RUN CMAKE_ARGS="-DGGML_CUDA=on" pip install llama-cpp-python==v0.3.5

cesarandreslopez avatar Feb 12 '25 16:02 cesarandreslopez
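
For reference, the RUN prefix above suggests a Dockerfile line; outside Docker the equivalent shell command, pinning the same version, would be:

CMAKE_ARGS="-DGGML_CUDA=on" pip install llama-cpp-python==0.3.5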

I found a solution in another issue (https://github.com/abetlen/llama-cpp-python/issues/1573); this install command worked for me:

CMAKE_ARGS="-DGGML_CUDA=on -DLLAVA_BUILD=off" pip install llama-cpp-python --upgrade --force-reinstall --no-cache-dir

pklochowicz avatar Feb 12 '25 22:02 pklochowicz
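
As a quick sanity check after an install like the one above, the package should import cleanly and report its version (this only verifies that the wheel built, not that CUDA offload works):

python -c "import llama_cpp; print(llama_cpp.__version__)"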

> I found a solution in another issue (#1573); this install command worked for me: CMAKE_ARGS="-DGGML_CUDA=on -DLLAVA_BUILD=off" pip install llama-cpp-python --upgrade --force-reinstall --no-cache-dir

That solved the pip install, but why?

RobinQu avatar May 21 '25 05:05 RobinQu
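
A likely answer, going by the log in the original report: the only targets that fail to link (llama-llava-cli, llama-qwen2vl-cli, llama-minicpmv-cli) are llava example binaries, while the shared libraries the Python bindings actually need built fine. -DLLAVA_BUILD=off simply skips the example executables whose link step trips over the missing libgomp.so.1. An untested alternative that may avoid the failure without disabling llava is to hand the linker the system library directory explicitly (the path below is the usual Ubuntu location and is an assumption):

CMAKE_ARGS="-DGGML_CUDA=on -DCMAKE_EXE_LINKER_FLAGS=-L/usr/lib/x86_64-linux-gnu" pip install llama-cpp-python --no-cache-dir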

This worked for me:

export CC=/usr/bin/gcc
export CXX=/usr/bin/g++
export CMAKE_C_COMPILER=/usr/bin/gcc
export CMAKE_CXX_COMPILER=/usr/bin/g++
export OpenMP_C_FLAGS="-fopenmp"
export OpenMP_CXX_FLAGS="-fopenmp"
export OpenMP_C_LIB_NAMES="gomp"
export OpenMP_CXX_LIB_NAMES="gomp"
export OpenMP_gomp_LIBRARY="/usr/lib/x86_64-linux-gnu/libgomp.so.1"
export CMAKE_ARGS="-DOpenMP_C_FLAGS=-fopenmp -DOpenMP_CXX_FLAGS=-fopenmp -DOpenMP_C_LIB_NAMES=gomp -DOpenMP_CXX_LIB_NAMES=gomp -DOpenMP_gomp_LIBRARY=/usr/lib/x86_64-linux-gnu/libgomp.so.1"
pip install llama-cpp-python

parkitny avatar Jul 25 '25 02:07 parkitny
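
The block above works because the OpenMP_* entries in CMAKE_ARGS feed CMake's FindOpenMP module its hint variables, so the build links the system libgomp directly instead of relying on the conda compat linker's search path. (The plain export OpenMP_* lines are likely redundant, since CMake reads those names as cache variables rather than environment variables.) A condensed sketch of the same idea, assuming the usual Ubuntu x86_64 path:

CMAKE_ARGS="-DOpenMP_C_FLAGS=-fopenmp -DOpenMP_CXX_FLAGS=-fopenmp -DOpenMP_C_LIB_NAMES=gomp -DOpenMP_CXX_LIB_NAMES=gomp -DOpenMP_gomp_LIBRARY=/usr/lib/x86_64-linux-gnu/libgomp.so.1" pip install llama-cpp-python --no-cache-dir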

In my case, I first needed to locate libgomp.so.1 with find /usr -name libgomp.so.1, and then use that path when exporting the variables.

glad4enkonm avatar Nov 08 '25 12:11 glad4enkonm
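
Combining that tip with the earlier CMAKE_ARGS approach, a hypothetical one-liner (assumes find returns exactly one match; GOMP_PATH is an illustrative variable name):

# Locate the system OpenMP runtime and pass it straight to CMake's FindOpenMP
GOMP_PATH="$(find /usr -name libgomp.so.1 2>/dev/null | head -n1)"
CMAKE_ARGS="-DGGML_CUDA=on -DOpenMP_gomp_LIBRARY=${GOMP_PATH}" pip install llama-cpp-python --no-cache-dir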