ktransformers [Bug] Building wheel for ktransformers (pyproject.toml) did not run successfully

Checklist

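[x] 1. I have searched for related issues but did not get the help I expected
[x] 2. The issue has not been fixed in the latest version
[ ] 3. Please note that if a bug-related issue lacks the corresponding environment information and a minimal reproducible example, it will be difficult for us to reproduce and locate the problem, which reduces the likelihood of receiving feedback
[ ] 4. If you are raising a question rather than a bug, please start a discussion at https://github.com/kvcache-ai/ktransformers/discussions. Otherwise the issue will be closed
[ ] 5. To facilitate community communication, I will use Chinese/English or attach a Chinese/English translation (if using another language). Non-Chinese/English content without a translation may be closed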

Problem description

Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/3] c++ -MMD -MF /home/xstdsr1-test/ktransformers/build/temp.linux-x86_64-cpython-311/csrc/custom_marlin/binding.o.d -pthread -B /home/xstdsr1-test/miniconda3/envs/kt/compiler_compat -DNDEBUG -fwrapv -O2 -Wall -fPIC -O2 -isystem /home/xstdsr1-test/miniconda3/envs/kt/include -fPIC -O2 -isystem /home/xstdsr1-test/miniconda3/envs/kt/include -fPIC -I/home/xstdsr1-test/miniconda3/envs/kt/lib/python3.11/site-packages/torch/include -I/home/xstdsr1-test/miniconda3/envs/kt/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -I/usr/local/cuda-12.6/include -I/home/xstdsr1-test/miniconda3/envs/kt/include/python3.11 -c -c /home/xstdsr1-test/ktransformers/csrc/custom_marlin/binding.cpp -o /home/xstdsr1-test/ktransformers/build/temp.linux-x86_64-cpython-311/csrc/custom_marlin/binding.o -O3 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1016"' -DTORCH_EXTENSION_NAME=vLLMMarlin -D_GLIBCXX_USE_CXX11_ABI=1 -std=c++17
[2/3] /usr/local/cuda-12.6/bin/nvcc --generate-dependencies-with-compile --dependency-output /home/xstdsr1-test/ktransformers/build/temp.linux-x86_64-cpython-311/csrc/custom_marlin/gptq_marlin/gptq_marlin_repack.o.d -I/home/xstdsr1-test/miniconda3/envs/kt/lib/python3.11/site-packages/torch/include -I/home/xstdsr1-test/miniconda3/envs/kt/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -I/usr/local/cuda-12.6/include -I/home/xstdsr1-test/miniconda3/envs/kt/include/python3.11 -c -c /home/xstdsr1-test/ktransformers/csrc/custom_marlin/gptq_marlin/gptq_marlin_repack.cu -o /home/xstdsr1-test/ktransformers/build/temp.linux-x86_64-cpython-311/csrc/custom_marlin/gptq_marlin/gptq_marlin_repack.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 -Xcompiler -fPIC -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1016"' -DTORCH_EXTENSION_NAME=vLLMMarlin -D_GLIBCXX_USE_CXX11_ABI=1 -gencode=arch=compute_61,code=compute_61 -gencode=arch=compute_61,code=sm_61 -std=c++17
/home/xstdsr1-test/ktransformers/csrc/custom_marlin/gptq_marlin/gptq_marlin_repack.cu(5): warning #177-D: variable "gptq_marlin::repack_stages" was declared but never referenced
  static constexpr int repack_stages = 8;
                       ^

Remark: The warnings can be suppressed with "-diag-suppress <warning-number>"

/home/xstdsr1-test/ktransformers/csrc/custom_marlin/gptq_marlin/gptq_marlin_repack.cu(7): warning #177-D: variable "gptq_marlin::repack_threads" was declared but never referenced
  static constexpr int repack_threads = 256;
                       ^

/home/xstdsr1-test/ktransformers/csrc/custom_marlin/gptq_marlin/gptq_marlin_repack.cu(10): warning #177-D: variable "gptq_marlin::tile_n_size" was declared but never referenced
  static constexpr int tile_n_size = tile_k_size * 4;
                       ^

[3/3] /usr/local/cuda-12.6/bin/nvcc --generate-dependencies-with-compile --dependency-output /home/xstdsr1-test/ktransformers/build/temp.linux-x86_64-cpython-311/csrc/custom_marlin/gptq_marlin/gptq_marlin.o.d -I/home/xstdsr1-test/miniconda3/envs/kt/lib/python3.11/site-packages/torch/include -I/home/xstdsr1-test/miniconda3/envs/kt/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -I/usr/local/cuda-12.6/include -I/home/xstdsr1-test/miniconda3/envs/kt/include/python3.11 -c -c /home/xstdsr1-test/ktransformers/csrc/custom_marlin/gptq_marlin/gptq_marlin.cu -o /home/xstdsr1-test/ktransformers/build/temp.linux-x86_64-cpython-311/csrc/custom_marlin/gptq_marlin/gptq_marlin.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 -Xcompiler -fPIC -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1016"' -DTORCH_EXTENSION_NAME=vLLMMarlin -D_GLIBCXX_USE_CXX11_ABI=1 -gencode=arch=compute_61,code=compute_61 -gencode=arch=compute_61,code=sm_61 -std=c++17
g++ -pthread -B /home/xstdsr1-test/miniconda3/envs/kt/compiler_compat -DNDEBUG -fwrapv -O2 -Wall -fPIC -O2 -isystem /home/xstdsr1-test/miniconda3/envs/kt/include -fPIC -O2 -isystem /home/xstdsr1-test/miniconda3/envs/kt/include -pthread -B /home/xstdsr1-test/miniconda3/envs/kt/compiler_compat -shared /home/xstdsr1-test/ktransformers/build/temp.linux-x86_64-cpython-311/csrc/custom_marlin/binding.o /home/xstdsr1-test/ktransformers/build/temp.linux-x86_64-cpython-311/csrc/custom_marlin/gptq_marlin/gptq_marlin.o /home/xstdsr1-test/ktransformers/build/temp.linux-x86_64-cpython-311/csrc/custom_marlin/gptq_marlin/gptq_marlin_repack.o -L/home/xstdsr1-test/miniconda3/envs/kt/lib/python3.11/site-packages/torch/lib -L/usr/local/cuda-12.6/lib64 -lc10 -ltorch -ltorch_cpu -ltorch_python -lcudart -lc10_cuda -ltorch_cuda -o build/lib.linux-x86_64-cpython-311/vLLMMarlin.cpython-311-x86_64-linux-gnu.so
-- Using compiler: /usr/bin/g++-13
CMake Error at CMakeLists.txt:39 (add_subdirectory):
  The source directory

    /home/xstdsr1-test/ktransformers/third_party/prometheus-cpp

  does not contain a CMakeLists.txt file.

-- xxHash build type: Release
-- Architecture: x86_64
CMake Deprecation Warning at /home/xstdsr1-test/ktransformers/third_party/pybind11/CMakeLists.txt:13 (cmake_minimum_required):
  Compatibility with CMake < 3.10 will be removed from a future version of
  CMake.

  Update the VERSION argument <min> value.  Or, use the <min>...<max> syntax
  to tell CMake that the project requires at least <min> but has been updated
  to work with policies introduced by <max> or earlier.

-- pybind11 v2.14.0 dev1
-- Found PyTorch at: /home/xstdsr1-test/miniconda3/envs/kt/lib/python3.11/site-packages/torch
-- PyTorch: CUDA detected: 12.6
-- PyTorch: CUDA nvcc is: /usr/local/cuda-12.6/bin/nvcc
-- PyTorch: CUDA toolkit directory: /usr/local/cuda-12.6
-- PyTorch: Header version is: 12.6
CMake Warning at /home/xstdsr1-test/miniconda3/envs/kt/lib/python3.11/site-packages/torch/share/cmake/Caffe2/public/cuda.cmake:140 (message):
  Failed to compute shorthash for libnvrtc.so
Call Stack (most recent call first):
  /home/xstdsr1-test/miniconda3/envs/kt/lib/python3.11/site-packages/torch/share/cmake/Caffe2/Caffe2Config.cmake:86 (include)
  /home/xstdsr1-test/miniconda3/envs/kt/lib/python3.11/site-packages/torch/share/cmake/Torch/TorchConfig.cmake:68 (find_package)
  CMakeLists.txt:63 (find_package)

CMake Warning (dev) at /home/xstdsr1-test/miniconda3/envs/kt/lib/python3.11/site-packages/cmake/data/share/cmake-4.0/Modules/FindPackageHandleStandardArgs.cmake:430 (message):
  The package name passed to find_package_handle_standard_args() (nvtx3) does
  not match the name of the calling package (Caffe2). This can lead to
  problems in calling code that expects find_package() result variables
  (e.g., _FOUND) to follow a certain pattern.
Call Stack (most recent call first):
  /home/xstdsr1-test/miniconda3/envs/kt/lib/python3.11/site-packages/torch/share/cmake/Caffe2/public/cuda.cmake:178 (find_package_handle_standard_args)
  /home/xstdsr1-test/miniconda3/envs/kt/lib/python3.11/site-packages/torch/share/cmake/Caffe2/Caffe2Config.cmake:86 (include)
  /home/xstdsr1-test/miniconda3/envs/kt/lib/python3.11/site-packages/torch/share/cmake/Torch/TorchConfig.cmake:68 (find_package)
  CMakeLists.txt:63 (find_package)
This warning is for project developers. Use -Wno-dev to suppress it.

-- Could NOT find nvtx3 (missing: nvtx3_dir)
CMake Warning at /home/xstdsr1-test/miniconda3/envs/kt/lib/python3.11/site-packages/torch/share/cmake/Caffe2/public/cuda.cmake:184 (message):
  Cannot find NVTX3, find old NVTX instead
Call Stack (most recent call first):
  /home/xstdsr1-test/miniconda3/envs/kt/lib/python3.11/site-packages/torch/share/cmake/Caffe2/Caffe2Config.cmake:86 (include)
  /home/xstdsr1-test/miniconda3/envs/kt/lib/python3.11/site-packages/torch/share/cmake/Torch/TorchConfig.cmake:68 (find_package)
  CMakeLists.txt:63 (find_package)

-- USE_CUDNN is set to 0. Compiling without cuDNN support
-- USE_CUSPARSELT is set to 0. Compiling without cuSPARSELt support
-- USE_CUDSS is set to 0. Compiling without cuDSS support
-- USE_CUFILE is set to 0. Compiling without cuFile support
-- Autodetected CUDA architecture(s): 6.1
-- Added CUDA NVCC flags for: -gencode;arch=compute_61,code=sm_61
CMake Warning at /home/xstdsr1-test/miniconda3/envs/kt/lib/python3.11/site-packages/torch/share/cmake/Torch/TorchConfig.cmake:22 (message):
  static library kineto_LIBRARY-NOTFOUND not found.
Call Stack (most recent call first):
  /home/xstdsr1-test/miniconda3/envs/kt/lib/python3.11/site-packages/torch/share/cmake/Torch/TorchConfig.cmake:125 (append_torchlib_if_found)
  CMakeLists.txt:63 (find_package)

-- Using aio
-- Found PyTorch at: /home/xstdsr1-test/miniconda3/envs/kt/lib/python3.11/site-packages/torch
CMake Warning at /home/xstdsr1-test/miniconda3/envs/kt/lib/python3.11/site-packages/torch/share/cmake/Torch/TorchConfig.cmake:22 (message):
  static library kineto_LIBRARY-NOTFOUND not found.
Call Stack (most recent call first):
  /home/xstdsr1-test/miniconda3/envs/kt/lib/python3.11/site-packages/torch/share/cmake/Torch/TorchConfig.cmake:125 (append_torchlib_if_found)
  kvc2/CMakeLists.txt:53 (find_package)

CMake Warning (dev) at kvc2/CMakeLists.txt:58 (find_package):
  Policy CMP0146 is not set: The FindCUDA module is removed. Run "cmake
  --help-policy CMP0146" for policy details. Use the cmake_policy command to
  set the policy and suppress this warning.

This warning is for project developers. Use -Wno-dev to suppress it.

CMake Error at kvc2/CMakeLists.txt:62 (message):
  prometheus-cpp::pull not found

-- Configuring incomplete, errors occurred!
CMake args: ['-DCMAKE_LIBRARY_OUTPUT_DIRECTORY=/home/xstdsr1-test/ktransformers/build/lib.linux-x86_64-cpython-311/', '-DPYTHON_EXECUTABLE=/home/xstdsr1-test/miniconda3/envs/kt/bin/python', '-DCMAKE_BUILD_TYPE=Release', '-DKTRANSFORMERS_USE_CUDA=ON', '-D_GLIBCXX_USE_CXX11_ABI=1']
CMake args: ['-DCMAKE_LIBRARY_OUTPUT_DIRECTORY=/home/xstdsr1-test/ktransformers/build/lib.linux-x86_64-cpython-311/', '-DPYTHON_EXECUTABLE=/home/xstdsr1-test/miniconda3/envs/kt/bin/python', '-DCMAKE_BUILD_TYPE=Release', '-DKTRANSFORMERS_USE_CUDA=ON', '-D_GLIBCXX_USE_CXX11_ABI=1', '-DLLAMA_NATIVE=ON', '-DEXAMPLE_VERSION_INFO=0.3.1+cu126torch27avx2']
build_temp: /home/xstdsr1-test/ktransformers/csrc/balance_serve/build
Traceback (most recent call last):
  File "/home/xstdsr1-test/miniconda3/envs/kt/lib/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 389, in <module>
    main()
  File "/home/xstdsr1-test/miniconda3/envs/kt/lib/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 373, in main
    json_out["return_val"] = hook(**hook_input["kwargs"])
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/xstdsr1-test/miniconda3/envs/kt/lib/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 280, in build_wheel
    return _build_backend().build_wheel(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/xstdsr1-test/miniconda3/envs/kt/lib/python3.11/site-packages/setuptools/build_meta.py", line 438, in build_wheel
    return _build(['bdist_wheel', '--dist-info-dir', str(metadata_directory)])
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/xstdsr1-test/miniconda3/envs/kt/lib/python3.11/site-packages/setuptools/build_meta.py", line 426, in _build
    return self._build_with_temp_dir(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/xstdsr1-test/miniconda3/envs/kt/lib/python3.11/site-packages/setuptools/build_meta.py", line 407, in _build_with_temp_dir
    self.run_setup()
  File "/home/xstdsr1-test/miniconda3/envs/kt/lib/python3.11/site-packages/setuptools/build_meta.py", line 320, in run_setup
    exec(code, locals())
  File "<string>", line 668, in <module>
  File "/home/xstdsr1-test/miniconda3/envs/kt/lib/python3.11/site-packages/setuptools/__init__.py", line 117, in setup
    return distutils.core.setup(**attrs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/xstdsr1-test/miniconda3/envs/kt/lib/python3.11/site-packages/setuptools/_distutils/core.py", line 186, in setup
    return run_commands(dist)
           ^^^^^^^^^^^^^^^^^^
  File "/home/xstdsr1-test/miniconda3/envs/kt/lib/python3.11/site-packages/setuptools/_distutils/core.py", line 202, in run_commands
    dist.run_commands()
  File "/home/xstdsr1-test/miniconda3/envs/kt/lib/python3.11/site-packages/setuptools/_distutils/dist.py", line 1002, in run_commands
    self.run_command(cmd)
  File "/home/xstdsr1-test/miniconda3/envs/kt/lib/python3.11/site-packages/setuptools/dist.py", line 1104, in run_command
    super().run_command(command)
  File "/home/xstdsr1-test/miniconda3/envs/kt/lib/python3.11/site-packages/setuptools/_distutils/dist.py", line 1021, in run_command
    cmd_obj.run()
  File "<string>", line 263, in run
  File "/home/xstdsr1-test/miniconda3/envs/kt/lib/python3.11/site-packages/setuptools/command/bdist_wheel.py", line 370, in run
    self.run_command("build")
  File "/home/xstdsr1-test/miniconda3/envs/kt/lib/python3.11/site-packages/setuptools/_distutils/cmd.py", line 357, in run_command
    self.distribution.run_command(command)
  File "/home/xstdsr1-test/miniconda3/envs/kt/lib/python3.11/site-packages/setuptools/dist.py", line 1104, in run_command
    super().run_command(command)
  File "/home/xstdsr1-test/miniconda3/envs/kt/lib/python3.11/site-packages/setuptools/_distutils/dist.py", line 1021, in run_command
    cmd_obj.run()
  File "/home/xstdsr1-test/miniconda3/envs/kt/lib/python3.11/site-packages/setuptools/_distutils/command/build.py", line 135, in run
    self.run_command(cmd_name)
  File "/home/xstdsr1-test/miniconda3/envs/kt/lib/python3.11/site-packages/setuptools/_distutils/cmd.py", line 357, in run_command
    self.distribution.run_command(command)
  File "/home/xstdsr1-test/miniconda3/envs/kt/lib/python3.11/site-packages/setuptools/dist.py", line 1104, in run_command
    super().run_command(command)
  File "/home/xstdsr1-test/miniconda3/envs/kt/lib/python3.11/site-packages/setuptools/_distutils/dist.py", line 1021, in run_command
    cmd_obj.run()
  File "/home/xstdsr1-test/miniconda3/envs/kt/lib/python3.11/site-packages/setuptools/command/build_ext.py", line 99, in run
    _build_ext.run(self)
  File "/home/xstdsr1-test/miniconda3/envs/kt/lib/python3.11/site-packages/setuptools/_distutils/command/build_ext.py", line 368, in run
    self.build_extensions()
  File "/home/xstdsr1-test/miniconda3/envs/kt/lib/python3.11/site-packages/torch/utils/cpp_extension.py", line 1010, in build_extensions
    build_ext.build_extensions(self)
  File "/home/xstdsr1-test/miniconda3/envs/kt/lib/python3.11/site-packages/setuptools/_distutils/command/build_ext.py", line 484, in build_extensions
    self._build_extensions_serial()
  File "/home/xstdsr1-test/miniconda3/envs/kt/lib/python3.11/site-packages/setuptools/_distutils/command/build_ext.py", line 510, in _build_extensions_serial
    self.build_extension(ext)
  File "<string>", line 591, in build_extension
  File "<string>", line 370, in run_command_with_live_tail
  File "/home/xstdsr1-test/miniconda3/envs/kt/lib/python3.11/subprocess.py", line 571, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['cmake', '/home/xstdsr1-test/ktransformers/csrc/balance_serve', '-DCMAKE_LIBRARY_OUTPUT_DIRECTORY=/home/xstdsr1-test/ktransformers/build/lib.linux-x86_64-cpython-311/', '-DPYTHON_EXECUTABLE=/home/xstdsr1-test/miniconda3/envs/kt/bin/python', '-DCMAKE_BUILD_TYPE=Release', '-DKTRANSFORMERS_USE_CUDA=ON', '-D_GLIBCXX_USE_CXX11_ABI=1', '-DLLAMA_NATIVE=ON', '-DEXAMPLE_VERSION_INFO=0.3.1+cu126torch27avx2']' returned non-zero exit status 1.
error: subprocess-exited-with-error

× Building wheel for ktransformers (pyproject.toml) did not run successfully.
│ exit code: 1
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.
full command: /home/xstdsr1-test/miniconda3/envs/kt/bin/python /home/xstdsr1-test/miniconda3/envs/kt/lib/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py build_wheel /tmp/tmpgc49p76y
cwd: /home/xstdsr1-test/ktransformers
Building wheel for ktransformers (pyproject.toml) ... error
ERROR: Failed building wheel for ktransformers
Failed to build ktransformers
ERROR: Failed to build installable wheels for some pyproject.toml based projects (ktransformers)
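In a log this long, the lines that actually stop the configure step are easy to lose among the warnings. The following is a minimal sketch, assuming the build output has been captured as text, that lists every "CMake Error at" header. The function name find_cmake_errors and the abridged sample string are illustrative and not part of ktransformers.

```python
import re

# Matches the header CMake prints for a configure-time error, for example
# "CMake Error at CMakeLists.txt:39 (add_subdirectory):".
# Warnings ("CMake Warning at ...") are deliberately not matched.
CMAKE_ERROR_RE = re.compile(r"CMake Error at (\S+?):(\d+) \((\w+)\):")


def find_cmake_errors(log_text):
    """Return (file, line, command) for every 'CMake Error at' header."""
    return [
        (m.group(1), int(m.group(2)), m.group(3))
        for m in CMAKE_ERROR_RE.finditer(log_text)
    ]


# Abridged excerpt of the log in this report (hypothetical capture).
sample = (
    "-- Using compiler: /usr/bin/g++-13 "
    "CMake Error at CMakeLists.txt:39 (add_subdirectory): The source directory "
    "CMake Warning (dev) at kvc2/CMakeLists.txt:58 (find_package): "
    "Policy CMP0146 is not set "
    "CMake Error at kvc2/CMakeLists.txt:62 (message): "
    "prometheus-cpp::pull not found"
)

errors = find_cmake_errors(sample)
for path, line, command in errors:
    print(f"{path}:{line} ({command})")
```

On this excerpt the sketch reports the two errors, both of which mention prometheus-cpp, and skips the warning between them.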

Steps to reproduce

git clone https://github.com/kvcache-ai/ktransformers.git
cd ktransformers
git submodule update --init --recursive
sudo env USE_BALANCE_SERVE=1 PYTHONPATH="$(which python)" PATH="$(dirname $(which python)):$PATH" bash ./install.sh
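The configure step stops because third_party/prometheus-cpp has no CMakeLists.txt. An empty directory is what an uninitialized or partially fetched submodule looks like, although nothing in this thread confirms that as the cause. A minimal sketch of a pre-build check, assuming it is pointed at the repository's third_party directory; the helper name dirs_missing_cmakelists is hypothetical and the demo runs on a throwaway directory tree rather than a real checkout.

```python
import tempfile
from pathlib import Path


def dirs_missing_cmakelists(third_party, names):
    """Return the names under third_party that have no CMakeLists.txt."""
    base = Path(third_party)
    return [n for n in names if not (base / n / "CMakeLists.txt").is_file()]


# Demo on a temporary tree: pybind11 is populated, prometheus-cpp is empty,
# mirroring the state the CMake error in this report describes.
with tempfile.TemporaryDirectory() as tmp:
    third_party = Path(tmp) / "third_party"
    (third_party / "pybind11").mkdir(parents=True)
    (third_party / "pybind11" / "CMakeLists.txt").write_text("# placeholder\n")
    (third_party / "prometheus-cpp").mkdir()

    missing = dirs_missing_cmakelists(third_party, ["pybind11", "prometheus-cpp"])
    print(missing)
```

Against a real clone, the same function could be called with the path to ktransformers/third_party before running install.sh; any name it returns points at a directory worth inspecting.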

Environment

CPU: Intel i7-8700
GPU: NVIDIA GeForce GTX 1080
MEM: 24G
OS: Ubuntu24.04-WSL2 on Win11

(kt) xstdsr1-test@DESKTOP-85TF6G2:~$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2024 NVIDIA Corporation
Built on Tue_Oct_29_23:50:19_PDT_2024
Cuda compilation tools, release 12.6, V12.6.85
Build cuda_12.6.r12.6/compiler.35059454_0
(kt) xstdsr1-test@DESKTOP-85TF6G2:~$ python --version
Python 3.11.13
(kt) xstdsr1-test@DESKTOP-85TF6G2:~$ cmake --version
cmake version 4.0.3

CMake suite maintained and supported by Kitware (kitware.com/cmake).
(kt) xstdsr1-test@DESKTOP-85TF6G2:~$ python -c "import torch; print(torch.__version__)"
2.7.1+cu126
(kt) xstdsr1-test@DESKTOP-85TF6G2:~$ python -c "import torch; print(torch.cuda.is_available())"
True
(kt) xstdsr1-test@DESKTOP-85TF6G2:~$ gcc --version
gcc (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0
Copyright (C) 2023 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

(kt) xstdsr1-test@DESKTOP-85TF6G2:~$ g++ --version
g++ (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0
Copyright (C) 2023 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Jun 24 '25 08:06 Z27-DSR1

CPU_INSTRUCT=NATIVE USE_BALANCE_SERVE=1 USE_NUMA=1 TORCH_CUDA_ARCH_LIST="8.0;8.6;8.7;8.9;9.0;12.0" bash ./install.sh
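The build log reports "Autodetected CUDA architecture(s): 6.1" for the GTX 1080, while the list in this command starts at 8.0. Whether that has any bearing on the CMake failure is not established in this thread. A minimal sketch for checking whether a given compute capability appears in such a list; the helper names are hypothetical, only numeric entries are handled (named architectures are out of scope), and the comparison is an exact string match.

```python
def parse_arch_list(value):
    """Split a TORCH_CUDA_ARCH_LIST-style string into numeric entries.

    Accepts ';' or space separators and drops a trailing '+PTX' suffix.
    Named architectures are not handled in this sketch.
    """
    entries = value.replace(" ", ";").split(";")
    return [e.removesuffix("+PTX") for e in entries if e]


def covers(value, capability):
    """True if the capability string (e.g. '6.1') is listed exactly."""
    return capability in parse_arch_list(value)


arch_list = "8.0;8.6;8.7;8.9;9.0;12.0"  # value from the command above
detected = "6.1"                         # architecture autodetected in the log

print(covers(arch_list, detected))
```

str.removesuffix requires Python 3.9 or newer, which the Python 3.11 environment in this report satisfies.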

Jul 16 '25 07:07 WalkerWen

I also encountered the same problem. Adding USE_BALANCE_SERVE=1 will cause the build to fail.

Jul 25 '25 09:07 shenyan-008

I'm facing the same issue. Adding USE_BALANCE_SERVE=1 causes the build to fail. Could anyone provide a solution?

Aug 25 '25 01:08 corengh