
ERROR: Failed building wheel for llama-cpp-python for SYCL installation on Windows

[Open] sunilmathew-mcw opened this issue 1 year ago · 12 comments

Hardware: Intel 14900K CPU, Intel Arc A770 GPU
Software: Windows 11 Pro

(ipex) PS C:\Users\sunil> $env:CMAKE_ARGS=" -DGGML_SYCL=on -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx"
(ipex) PS C:\Users\sunil> pip install --upgrade --force-reinstall --no-cache-dir  llama-cpp-python 
Building wheels for collected packages: llama-cpp-python
  Building wheel for llama-cpp-python (pyproject.toml) ... error
  error: subprocess-exited-with-error

  × Building wheel for llama-cpp-python (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> [48 lines of output]
      *** scikit-build-core 0.9.8 using CMake 3.30.1 (wheel)
      *** Configuring CMake...
      2024-07-22 10:35:57,085 - scikit_build_core - WARNING - Can't find a Python library, got libdir=None, ldlibrary=None, multiarch=None, masd=None
      loading initial cache file C:\Users\sunil\AppData\Local\Temp\tmpj6iqbcsh\build\CMakeInit.txt
      -- Building for: Visual Studio 17 2022
      -- Selecting Windows SDK version 10.0.22621.0 to target Windows 10.0.22631.
      -- The C compiler identification is MSVC 19.39.33522.0
      -- The CXX compiler identification is MSVC 19.39.33522.0
      -- Detecting C compiler ABI info
      -- Detecting C compiler ABI info - done
      -- Check for working C compiler: C:/Program Files (x86)/Microsoft Visual Studio/2022/BuildTools/VC/Tools/MSVC/14.39.33519/bin/Hostx64/x64/cl.exe - skipped
      -- Detecting C compile features
      -- Detecting C compile features - done
      -- Detecting CXX compiler ABI info
      -- Detecting CXX compiler ABI info - done
      -- Check for working CXX compiler: C:/Program Files (x86)/Microsoft Visual Studio/2022/BuildTools/VC/Tools/MSVC/14.39.33519/bin/Hostx64/x64/cl.exe - skipped       
      -- Detecting CXX compile features
      -- Detecting CXX compile features - done
      -- Found Git: C:/Program Files/Git/cmd/git.exe (found version "2.43.0.windows.1")
      -- Performing Test CMAKE_HAVE_LIBC_PTHREAD
      -- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
      -- Looking for pthread_create in pthreads
      -- Looking for pthread_create in pthreads - not found
      -- Looking for pthread_create in pthread
      -- Looking for pthread_create in pthread - not found
      -- Found Threads: TRUE
      -- Found OpenMP_C: -openmp (found version "2.0")
      -- Found OpenMP_CXX: -openmp (found version "2.0")
      -- Found OpenMP: TRUE (found version "2.0")
      -- OpenMP found
      -- Using ggml SGEMM
      CMake Error at vendor/llama.cpp/ggml/src/CMakeLists.txt:471 (find_package):
        Found package configuration file:

          C:/Program Files (x86)/Intel/oneAPI/compiler/2024.1/lib/cmake/IntelSYCL/IntelSYCLConfig.cmake

        but it set IntelSYCL_FOUND to FALSE so package "IntelSYCL" is considered to
        be NOT FOUND.  Reason given by package:

        Unsupported compiler family MSVC and compiler C:/Program Files
        (x86)/Microsoft Visual
        Studio/2022/BuildTools/VC/Tools/MSVC/14.39.33519/bin/Hostx64/x64/cl.exe!!



      -- Configuring incomplete, errors occurred!

      *** CMake configuration failed
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for llama-cpp-python
Failed to build llama-cpp-python
ERROR: Could not build wheels for llama-cpp-python, which is required to install pyproject.toml-based projects

sunilmathew-mcw · Jul 22 '24
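
The log above shows the root cause: CMake fell back to the Visual Studio generator ("Building for: Visual Studio 17 2022"), which always compiles with MSVC's cl.exe and ignores CMAKE_C_COMPILER/CMAKE_CXX_COMPILER, so IntelSYCLConfig.cmake rejects the compiler family. A commonly suggested workaround, sketched below and not verified here, is to load the oneAPI environment and force the Ninja generator so the Intel compiler actually takes effect (this assumes scikit-build-core lets CMake pick up the CMAKE_GENERATOR environment variable and that Ninja is on PATH; note that on Windows the SYCL build pairs cl for C with icx for C++, as in the workaround later in this thread):

# From cmd, load the oneAPI environment first (default install path; adjust if needed):
#   "C:\Program Files (x86)\Intel\oneAPI\setvars.bat" intel64
# Then, in a PowerShell started from that shell, force the Ninja generator;
# the Visual Studio generator always uses cl.exe regardless of CMAKE_ARGS:
$env:CMAKE_GENERATOR = "Ninja"
$env:CMAKE_ARGS = "-DGGML_SYCL=on -DCMAKE_C_COMPILER=cl -DCMAKE_CXX_COMPILER=icx"
pip install --upgrade --force-reinstall --no-cache-dir llama-cpp-python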

I think @abetlen should precompile a Windows version, as he did for Linux. Windows is such a mess when it comes to compiling Python modules that use C++ code.

ParisNeo · Jul 22 '24

Hi @abetlen, can I assist you in any way to make this possible? Thank you, Sunil.

sunilmathew-mcw · Jul 24 '24

I'm hitting the same issue. Have you solved it?

kylo5aby · Jul 25 '24

Hi @kylo5aby, unfortunately not.

sunilmathew-mcw · Jul 25 '24

The last version for which I was able to build a SYCL wheel was v0.2.44. Something broke somewhere after that version, and I can't build it anymore.

BeamFain · Jul 26 '24
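
If an older wheel is acceptable, pinning that last known-good version is a possible stopgap; a sketch, with the same CMAKE_ARGS environment variable set beforehand, and not verified here:

pip install --force-reinstall --no-cache-dir llama-cpp-python==0.2.44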

I ran into the same issue today.

jthomazini · Aug 09 '24

Hi! I also ran into this issue a few days ago. Is there a workaround for it? Out of desperation I already tried throwing the SYCL-flavored libraries from llama.cpp and their dependencies into the lib folder of the venv, but without success.

Agilov1 · Aug 09 '24

> Hi! I also ran into this issue a few days ago. Is there a workaround for it? Out of desperation I already tried throwing the SYCL-flavored libraries from llama.cpp and their dependencies into the lib folder of the venv, but without success.

I was unable to get past this one either. However, I went back to llama.cpp for SYCL (https://github.com/ggerganov/llama.cpp/blob/master/docs/backend/SYCL.md), followed the instructions to build llama.cpp locally, and now I am able to run LLMs on my Arc A770 and it is running great.

jthomazini · Aug 10 '24
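
For anyone taking the same route, the Windows steps in that SYCL guide look roughly like the sketch below (condensed from the linked SYCL.md; run inside a shell with the oneAPI environment loaded, and check the guide for the current flags and tool names):

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
# with "C:\Program Files (x86)\Intel\oneAPI\setvars.bat" intel64 already run in this shell
cmake -B build -G "Ninja" -DGGML_SYCL=ON -DCMAKE_C_COMPILER=cl -DCMAKE_CXX_COMPILER=icx -DCMAKE_BUILD_TYPE=Release
cmake --build build --config Release
# list the SYCL devices to confirm the Arc GPU is visible
build\bin\ls-sycl-device.exe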

>> Hi! I also ran into this issue a few days ago. Is there a workaround for it? Out of desperation I already tried throwing the SYCL-flavored libraries from llama.cpp and their dependencies into the lib folder of the venv, but without success.

> I was unable to get past this one either. However, I went back to llama.cpp for SYCL (https://github.com/ggerganov/llama.cpp/blob/master/docs/backend/SYCL.md), followed the instructions to build llama.cpp locally, and now I am able to run LLMs on my Arc A770 and it is running great.

Hi, and thanks for the reply! Do you use "plain" llama.cpp for inference? For what I'm doing, it has to be integrated into a Python project, so I have used llama-cpp-python so far with CPU-only inference. But since I also need it working on Arc GPUs, I was wondering whether someone has already managed that. (BTW, I'm building on Windows 11 with VS 2022, CMake, and the oneAPI Toolkit installed.)

Agilov1 · Aug 10 '24

>>> Hi! I also ran into this issue a few days ago. Is there a workaround for it? Out of desperation I already tried throwing the SYCL-flavored libraries from llama.cpp and their dependencies into the lib folder of the venv, but without success.

>> I was unable to get past this one either. However, I went back to llama.cpp for SYCL (https://github.com/ggerganov/llama.cpp/blob/master/docs/backend/SYCL.md), followed the instructions to build llama.cpp locally, and now I am able to run LLMs on my Arc A770 and it is running great.

> Hi, and thanks for the reply! Do you use "plain" llama.cpp for inference? For what I'm doing, it has to be integrated into a Python project, so I have used llama-cpp-python so far with CPU-only inference. But since I also need it working on Arc GPUs, I was wondering whether someone has already managed that. (BTW, I'm building on Windows 11 with VS 2022, CMake, and the oneAPI Toolkit installed.)

I am a noob, so I am just learning this stuff. I am running llama.cpp with SYCL on an A770 with GGUF models, and so far it is running great. Sorry I can't help with your Python question; I could not build it when I tried.

jthomazini · Aug 11 '24

SYCL runs some models on llama.cpp with an Intel GPU and iGPU, but it does not run with llama-cpp-python. Why?

ayttop · Aug 28 '24

Hi All,

The only workaround for this is to first compile the llama.cpp C++ code using icx and then let pip compile the bindings using cl:

git clone https://github.com/abetlen/llama-cpp-python.git
cd llama-cpp-python/vendor
git clone https://github.com/ggml-org/llama.cpp.git
cd ..
cmake -B build -G "Ninja" -DGGML_SYCL=ON -DCMAKE_C_COMPILER=cl -DCMAKE_CXX_COMPILER=icx -DCMAKE_BUILD_TYPE=Release
cd build
cmake --build .
cd ..
pip install .

Finally, manually copy the DLLs from llama-cpp-python/build/bin/ to your Python site-packages/llama_cpp/lib folder.

I wonder what the issue is with scikit-build-core not honouring the CMake environment variables on Windows.

P.S. Make sure you use the latest VS Build Tools (2022) and the latest Intel oneAPI SDK. The latest Intel oneAPI SDK only allows x64 builds.

tesnorindian · Mar 07 '25
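
After copying the DLLs, one quick way to confirm the SYCL build actually loaded is to create a model with full GPU offload and watch the verbose startup log for the SYCL device listing (a sketch; model.gguf is a placeholder path):

python -c "from llama_cpp import Llama; Llama(model_path='model.gguf', n_gpu_layers=-1, verbose=True)"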