Please release a cuda build for v0.3.5
Hi there. I see there is a Metal build for v0.3.5. Would you please release a CUDA version?
Best regards
Agree. I've manually built a CUDA version, but an official prebuilt release should be convenient for most users.
How? Is there a reference for a manual build?
The latest version is v0.3.7. You can follow the steps in the CI workflow.
For Windows users, here is my two cents:
- (Optional) Uninstall all MinGW tools (clang, gcc, etc.). Install `Everything`.
- Install Visual Studio 2022 with MSVC 2022, CMake, and the Windows SDK. If you need to build with CUDA < 12.4, you should also install MSVC 2019. (You may need to add the directory of `cmake.exe` to PATH manually. Make sure that when you call cmake in PowerShell it uses the VS version of `cmake.exe`.)
- Install CUDA.
- Copy the four files from the CUDA `MSBuildExtensions` directory to the VS `BuildCustomizations` directory. (`Everything` may be useful for finding them.)
- Git clone the repository with the llama.cpp submodule.
- Activate the Python environment and run the following commands in PowerShell:
```powershell
$env:CMAKE_ARGS = "-DGGML_CUDA=ON"
python -m pip install build wheel
python -m build --wheel
```
If you need to build it with CUDA<12.4, use MSVC 2019:
```powershell
$env:CMAKE_ARGS = "-DGGML_CUDA=ON -DCMAKE_GENERATOR_TOOLSET=v142,host=x64,version=14.29"
python -m pip install build wheel
python -m build --wheel
```
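To avoid retyping the long toolset string, the two `CMAKE_ARGS` variants above can be composed programmatically. A minimal sketch, assuming (as stated above) that CUDA versions older than 12.4 need the MSVC 2019 toolset; the `cmake_args_for_cuda` helper is made up here, and the toolset string is copied from the commands above:

```python
def cmake_args_for_cuda(cuda_version: str) -> str:
    """Return the CMAKE_ARGS value for a given CUDA toolkit version.

    Hypothetical helper: CUDA < 12.4 gets the MSVC 2019 (v142) toolset,
    matching the two command variants shown above.
    """
    args = ["-DGGML_CUDA=ON"]
    major, minor = (int(x) for x in cuda_version.split(".")[:2])
    if (major, minor) < (12, 4):
        # Older CUDA releases only integrate with the v142 (VS 2019) toolset.
        args.append("-DCMAKE_GENERATOR_TOOLSET=v142,host=x64,version=14.29")
    return " ".join(args)

print(cmake_args_for_cuda("12.5"))  # -DGGML_CUDA=ON
print(cmake_args_for_cuda("11.8"))  # adds the v142 toolset argument
```

The returned string is what you would assign to `$env:CMAKE_ARGS` before running `python -m build --wheel`.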
@abetlen would you please add the workflow suggested by @la1ty to automate the generation of the builds as you release new versions?
+1 for pre-built wheels
@ZiyaCu @ParisNeo @la1ty, check out this repo: the textgen-webui release includes llama-cpp-python CUDA wheels.
The only downside is that these wheels can't be imported using `import llama_cpp`. Instead, you should use `import llama_cpp_cuda` or `import llama_cpp_cuda_tensorcore`, depending on the wheel you installed.
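Code that has to work with any of these wheels commonly just tries the module names in order. A minimal sketch; the module names are the ones mentioned above, and the `load_llama_cpp` helper name is my own, not part of any of these packages:

```python
import importlib

# Try the CUDA wheels first, then fall back to the standard package name.
CANDIDATES = ("llama_cpp_cuda_tensorcore", "llama_cpp_cuda", "llama_cpp")

def load_llama_cpp(candidates=CANDIDATES):
    """Return the first candidate module that imports successfully."""
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    raise ImportError(f"none of {candidates} could be imported")
```

Downstream code then uses the returned module object instead of a hard-coded `import llama_cpp`.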
You can find the wheels in the requirements file: 🔗 Requirements.txt
Or check the full release here: 🔗 llama-cpp-python-cuBLAS-wheels Release
@Amrabdelhamed611 thanks a lot. I'll take a look. I am using this in lollms, which should work on all kinds of systems, and it is a real pain having to write custom code for every configuration.
```
PS E:\llama-cpp-python> conda activate CUDA125-py312
(CUDA125-py312) PS E:\llama-cpp-python> $env:CMAKE_ARGS = "-DGGML_CUDA=ON"
(CUDA125-py312) PS E:\llama-cpp-python> python -m pip install build wheel
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple/, http://mirrors.aliyun.com/pypi/simple/
Collecting build
  Downloading http://mirrors.aliyun.com/pypi/packages/84/c2/80633736cd183ee4a62107413def345f7e6e3c01563dbca1417363cf957e/build-1.2.2.post1-py3-none-any.whl (22 kB)
Requirement already satisfied: wheel in d:\software\minipy312\envs\cuda125-py312\lib\site-packages (0.45.1)
Requirement already satisfied: packaging>=19.1 in d:\software\minipy312\envs\cuda125-py312\lib\site-packages (from build) (24.2)
Collecting pyproject_hooks (from build)
  Downloading http://mirrors.aliyun.com/pypi/packages/bd/24/12818598c362d7f300f18e74db45963dbcb85150324092410c8b49405e42/pyproject_hooks-1.2.0-py3-none-any.whl (10 kB)
Collecting colorama (from build)
  Downloading http://mirrors.aliyun.com/pypi/packages/d1/d6/3965ed04c63042e047cb6a3e6ed1a63a35087b6a609aa3a15ed8ac56c221/colorama-0.4.6-py2.py3-none-any.whl (25 kB)
Installing collected packages: pyproject_hooks, colorama, build
Successfully installed build-1.2.2.post1 colorama-0.4.6 pyproject_hooks-1.2.0
(CUDA125-py312) PS E:\llama-cpp-python> python -m build --wheel
- Creating isolated environment: venv+pip...
- Installing packages in isolated environment:
  - scikit-build-core[pyproject]>=0.9.2
- Getting build dependencies for wheel...
- Building wheel...
*** scikit-build-core 0.10.7 using CMake 3.31.4 (wheel)
*** Configuring CMake...
2025-02-22 02:34:20,699 - scikit_build_core - WARNING - Can't find a Python library, got libdir=None, ldlibrary=None, multiarch=None, masd=None
loading initial cache file C:\Users\ADMINI~1\AppData\Local\Temp\tmpui8yd0_s\build\CMakeInit.txt
-- Building for: Visual Studio 17 2022
-- Selecting Windows SDK version 10.0.22621.0 to target Windows 10.0.26100.
-- The C compiler identification is MSVC 19.43.34808.0
-- The CXX compiler identification is MSVC 19.43.34808.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: C:/Program Files/Microsoft Visual Studio/2022/Professional/VC/Tools/MSVC/14.43.34808/bin/Hostx64/x64/cl.exe - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: C:/Program Files/Microsoft Visual Studio/2022/Professional/VC/Tools/MSVC/14.43.34808/bin/Hostx64/x64/cl.exe - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found Git: C:/Program Files/Git/cmd/git.exe (found version "2.47.1.windows.2")
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - not found
-- Found Threads: TRUE
-- Warning: ccache not found - consider installing it for faster compilation or disable this warning with GGML_CCACHE=OFF
-- CMAKE_SYSTEM_PROCESSOR: AMD64
-- CMAKE_GENERATOR_PLATFORM: x64
-- Including CPU backend
-- Found OpenMP_C: -openmp (found version "2.0")
-- Found OpenMP_CXX: -openmp (found version "2.0")
-- Found OpenMP: TRUE (found version "2.0")
-- x86 detected
-- Performing Test HAS_AVX_1
-- Performing Test HAS_AVX_1 - Success
-- Performing Test HAS_AVX2_1
-- Performing Test HAS_AVX2_1 - Success
-- Performing Test HAS_FMA_1
-- Performing Test HAS_FMA_1 - Success
-- Performing Test HAS_AVX512_1
-- Performing Test HAS_AVX512_1 - Failed
-- Performing Test HAS_AVX512_2
-- Performing Test HAS_AVX512_2 - Failed
-- Adding CPU backend variant ggml-cpu: /arch:AVX2 GGML_AVX2;GGML_FMA;GGML_F16C
-- Found CUDAToolkit: C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.5/include (found version "12.5.82")
-- CUDA Toolkit found
-- Using CUDA architectures: native
CMake Error at D:/software/Minipy312/envs/CUDA125-py312/Lib/site-packages/cmake/data/share/cmake-3.31/Modules/CMakeDetermineCompilerId.cmake:614 (message):
  No CUDA toolset found.
Call Stack (most recent call first):
  D:/software/Minipy312/envs/CUDA125-py312/Lib/site-packages/cmake/data/share/cmake-3.31/Modules/CMakeDetermineCompilerId.cmake:8 (CMAKE_DETERMINE_COMPILER_ID_BUILD)
  D:/software/Minipy312/envs/CUDA125-py312/Lib/site-packages/cmake/data/share/cmake-3.31/Modules/CMakeDetermineCompilerId.cmake:53 (__determine_compiler_id_test)
  D:/software/Minipy312/envs/CUDA125-py312/Lib/site-packages/cmake/data/share/cmake-3.31/Modules/CMakeDetermineCUDACompiler.cmake:131 (CMAKE_DETERMINE_COMPILER_ID)
  vendor/llama.cpp/ggml/src/ggml-cuda/CMakeLists.txt:25 (enable_language)
-- Configuring incomplete, errors occurred!

*** CMake configuration failed
ERROR Backend subprocess exited when trying to invoke build_wheel
```
@dw5189 There are two possible causes, I guess:
- Make sure you are using the VS version of `cmake.exe` to compile this project. I run `cmake --version` in PowerShell and it returns `cmake version 3.29.5-msvc4`. (I tried the MinGW version and it failed. But currently the log seems normal, so good luck.)
- Copy the four files from the CUDA `MSBuildExtensions` directory to the VS `BuildCustomizations` directory. If you don't know how to do it, just search `No CUDA toolset found` in any web search engine and it should return plenty of pages with details.
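Once both directories are located, the copy step itself is simple. A hedged sketch in Python: the two paths below are placeholders that vary with your CUDA toolkit version and Visual Studio edition, and `copy_msbuild_extensions` is a made-up helper, not part of any tool:

```python
import shutil
from pathlib import Path

def copy_msbuild_extensions(src: Path, dst: Path) -> list[str]:
    """Copy every file from src into dst; return the copied file names."""
    copied = []
    for f in sorted(src.iterdir()):
        if f.is_file():
            shutil.copy2(f, dst / f.name)
            copied.append(f.name)
    return copied

# Placeholder paths -- adjust to your CUDA toolkit and Visual Studio edition.
cuda_ext = Path(r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.5"
                r"\extras\visual_studio_integration\MSBuildExtensions")
vs_custom = Path(r"C:\Program Files\Microsoft Visual Studio\2022\Professional"
                 r"\MSBuild\Microsoft\VC\v170\BuildCustomizations")
```

You would call `copy_msbuild_extensions(cuda_ext, vs_custom)` from an elevated prompt, since the VS directory is under `Program Files`.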