[Bug] Building wheel for ktransformers (pyproject.toml) did not run successfully
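Checklist
- [x] 1. I have searched for related issues but did not get the help I expected
- [x] 2. The problem has not been fixed in the latest version
- [ ] 3. Please note that if a bug-related issue lacks environment information and a minimal reproducible example, it will be difficult for us to reproduce and locate the problem, which reduces the likelihood of receiving feedback
- [ ] 4. If what you are raising is a question rather than a bug, please start a discussion at https://github.com/kvcache-ai/ktransformers/discussions. Otherwise the issue will be closed
- [ ] 5. To facilitate community communication, I will use Chinese/English or attach a Chinese/English translation (if using another language). Non-Chinese/English content without a translation may be closed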
Problem description
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/3] c++ -MMD -MF /home/xstdsr1-test/ktransformers/build/temp.linux-x86_64-cpython-311/csrc/custom_marlin/binding.o.d -pthread -B /home/xstdsr1-test/miniconda3/envs/kt/compiler_compat -DNDEBUG -fwrapv -O2 -Wall -fPIC -O2 -isystem /home/xstdsr1-test/miniconda3/envs/kt/include -fPIC -O2 -isystem /home/xstdsr1-test/miniconda3/envs/kt/include -fPIC -I/home/xstdsr1-test/miniconda3/envs/kt/lib/python3.11/site-packages/torch/include -I/home/xstdsr1-test/miniconda3/envs/kt/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -I/usr/local/cuda-12.6/include -I/home/xstdsr1-test/miniconda3/envs/kt/include/python3.11 -c -c /home/xstdsr1-test/ktransformers/csrc/custom_marlin/binding.cpp -o /home/xstdsr1-test/ktransformers/build/temp.linux-x86_64-cpython-311/csrc/custom_marlin/binding.o -O3 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1016"' -DTORCH_EXTENSION_NAME=vLLMMarlin -D_GLIBCXX_USE_CXX11_ABI=1 -std=c++17
[2/3] /usr/local/cuda-12.6/bin/nvcc --generate-dependencies-with-compile --dependency-output /home/xstdsr1-test/ktransformers/build/temp.linux-x86_64-cpython-311/csrc/custom_marlin/gptq_marlin/gptq_marlin_repack.o.d -I/home/xstdsr1-test/miniconda3/envs/kt/lib/python3.11/site-packages/torch/include -I/home/xstdsr1-test/miniconda3/envs/kt/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -I/usr/local/cuda-12.6/include -I/home/xstdsr1-test/miniconda3/envs/kt/include/python3.11 -c -c /home/xstdsr1-test/ktransformers/csrc/custom_marlin/gptq_marlin/gptq_marlin_repack.cu -o /home/xstdsr1-test/ktransformers/build/temp.linux-x86_64-cpython-311/csrc/custom_marlin/gptq_marlin/gptq_marlin_repack.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 -Xcompiler -fPIC -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1016"' -DTORCH_EXTENSION_NAME=vLLMMarlin -D_GLIBCXX_USE_CXX11_ABI=1 -gencode=arch=compute_61,code=compute_61 -gencode=arch=compute_61,code=sm_61 -std=c++17
/home/xstdsr1-test/ktransformers/csrc/custom_marlin/gptq_marlin/gptq_marlin_repack.cu(5): warning #177-D: variable "gptq_marlin::repack_stages" was declared but never referenced
  static constexpr int repack_stages = 8;
                       ^
Remark: The warnings can be suppressed with "-diag-suppress
/home/xstdsr1-test/ktransformers/csrc/custom_marlin/gptq_marlin/gptq_marlin_repack.cu(7): warning #177-D: variable "gptq_marlin::repack_threads" was declared but never referenced
  static constexpr int repack_threads = 256;
                       ^
/home/xstdsr1-test/ktransformers/csrc/custom_marlin/gptq_marlin/gptq_marlin_repack.cu(10): warning #177-D: variable "gptq_marlin::tile_n_size" was declared but never referenced
  static constexpr int tile_n_size = tile_k_size * 4;
                       ^
[3/3] /usr/local/cuda-12.6/bin/nvcc --generate-dependencies-with-compile --dependency-output /home/xstdsr1-test/ktransformers/build/temp.linux-x86_64-cpython-311/csrc/custom_marlin/gptq_marlin/gptq_marlin.o.d -I/home/xstdsr1-test/miniconda3/envs/kt/lib/python3.11/site-packages/torch/include -I/home/xstdsr1-test/miniconda3/envs/kt/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -I/usr/local/cuda-12.6/include -I/home/xstdsr1-test/miniconda3/envs/kt/include/python3.11 -c -c /home/xstdsr1-test/ktransformers/csrc/custom_marlin/gptq_marlin/gptq_marlin.cu -o /home/xstdsr1-test/ktransformers/build/temp.linux-x86_64-cpython-311/csrc/custom_marlin/gptq_marlin/gptq_marlin.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 -Xcompiler -fPIC -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1016"' -DTORCH_EXTENSION_NAME=vLLMMarlin -D_GLIBCXX_USE_CXX11_ABI=1 -gencode=arch=compute_61,code=compute_61 -gencode=arch=compute_61,code=sm_61 -std=c++17
g++ -pthread -B /home/xstdsr1-test/miniconda3/envs/kt/compiler_compat -DNDEBUG -fwrapv -O2 -Wall -fPIC -O2 -isystem /home/xstdsr1-test/miniconda3/envs/kt/include -fPIC -O2 -isystem /home/xstdsr1-test/miniconda3/envs/kt/include -pthread -B /home/xstdsr1-test/miniconda3/envs/kt/compiler_compat -shared /home/xstdsr1-test/ktransformers/build/temp.linux-x86_64-cpython-311/csrc/custom_marlin/binding.o /home/xstdsr1-test/ktransformers/build/temp.linux-x86_64-cpython-311/csrc/custom_marlin/gptq_marlin/gptq_marlin.o /home/xstdsr1-test/ktransformers/build/temp.linux-x86_64-cpython-311/csrc/custom_marlin/gptq_marlin/gptq_marlin_repack.o -L/home/xstdsr1-test/miniconda3/envs/kt/lib/python3.11/site-packages/torch/lib -L/usr/local/cuda-12.6/lib64 -lc10 -ltorch -ltorch_cpu -ltorch_python -lcudart -lc10_cuda -ltorch_cuda -o build/lib.linux-x86_64-cpython-311/vLLMMarlin.cpython-311-x86_64-linux-gnu.so
-- Using compiler: /usr/bin/g++-13
CMake Error at CMakeLists.txt:39 (add_subdirectory):
The source directory
/home/xstdsr1-test/ktransformers/third_party/prometheus-cpp
does not contain a CMakeLists.txt file.
-- xxHash build type: Release
-- Architecture: x86_64
CMake Deprecation Warning at /home/xstdsr1-test/ktransformers/third_party/pybind11/CMakeLists.txt:13 (cmake_minimum_required):
Compatibility with CMake < 3.10 will be removed from a future version of CMake.
Update the VERSION argument <min> value. Or, use the <min>...<max> syntax
to tell CMake that the project requires at least <min> but has been updated
to work with policies introduced by <max> or earlier.
-- pybind11 v2.14.0 dev1
-- Found PyTorch at: /home/xstdsr1-test/miniconda3/envs/kt/lib/python3.11/site-packages/torch
-- PyTorch: CUDA detected: 12.6
-- PyTorch: CUDA nvcc is: /usr/local/cuda-12.6/bin/nvcc
-- PyTorch: CUDA toolkit directory: /usr/local/cuda-12.6
-- PyTorch: Header version is: 12.6
CMake Warning at /home/xstdsr1-test/miniconda3/envs/kt/lib/python3.11/site-packages/torch/share/cmake/Caffe2/public/cuda.cmake:140 (message):
Failed to compute shorthash for libnvrtc.so
Call Stack (most recent call first):
/home/xstdsr1-test/miniconda3/envs/kt/lib/python3.11/site-packages/torch/share/cmake/Caffe2/Caffe2Config.cmake:86 (include)
/home/xstdsr1-test/miniconda3/envs/kt/lib/python3.11/site-packages/torch/share/cmake/Torch/TorchConfig.cmake:68 (find_package)
CMakeLists.txt:63 (find_package)
CMake Warning (dev) at /home/xstdsr1-test/miniconda3/envs/kt/lib/python3.11/site-packages/cmake/data/share/cmake-4.0/Modules/FindPackageHandleStandardArgs.cmake:430 (message):
The package name passed to find_package_handle_standard_args() (nvtx3) does
not match the name of the calling package (Caffe2). This can lead to
problems in calling code that expects find_package() result variables
(e.g., _FOUND) to follow a certain pattern.
Call Stack (most recent call first):
/home/xstdsr1-test/miniconda3/envs/kt/lib/python3.11/site-packages/torch/share/cmake/Caffe2/public/cuda.cmake:178 (find_package_handle_standard_args)
/home/xstdsr1-test/miniconda3/envs/kt/lib/python3.11/site-packages/torch/share/cmake/Caffe2/Caffe2Config.cmake:86 (include)
/home/xstdsr1-test/miniconda3/envs/kt/lib/python3.11/site-packages/torch/share/cmake/Torch/TorchConfig.cmake:68 (find_package)
CMakeLists.txt:63 (find_package)
This warning is for project developers. Use -Wno-dev to suppress it.
-- Could NOT find nvtx3 (missing: nvtx3_dir)
CMake Warning at /home/xstdsr1-test/miniconda3/envs/kt/lib/python3.11/site-packages/torch/share/cmake/Caffe2/public/cuda.cmake:184 (message):
Cannot find NVTX3, find old NVTX instead
Call Stack (most recent call first):
/home/xstdsr1-test/miniconda3/envs/kt/lib/python3.11/site-packages/torch/share/cmake/Caffe2/Caffe2Config.cmake:86 (include)
/home/xstdsr1-test/miniconda3/envs/kt/lib/python3.11/site-packages/torch/share/cmake/Torch/TorchConfig.cmake:68 (find_package)
CMakeLists.txt:63 (find_package)
-- USE_CUDNN is set to 0. Compiling without cuDNN support
-- USE_CUSPARSELT is set to 0. Compiling without cuSPARSELt support
-- USE_CUDSS is set to 0. Compiling without cuDSS support
-- USE_CUFILE is set to 0. Compiling without cuFile support
-- Autodetected CUDA architecture(s): 6.1
-- Added CUDA NVCC flags for: -gencode;arch=compute_61,code=sm_61
CMake Warning at /home/xstdsr1-test/miniconda3/envs/kt/lib/python3.11/site-packages/torch/share/cmake/Torch/TorchConfig.cmake:22 (message):
static library kineto_LIBRARY-NOTFOUND not found.
Call Stack (most recent call first):
/home/xstdsr1-test/miniconda3/envs/kt/lib/python3.11/site-packages/torch/share/cmake/Torch/TorchConfig.cmake:125 (append_torchlib_if_found)
CMakeLists.txt:63 (find_package)
-- Using aio
-- Found PyTorch at: /home/xstdsr1-test/miniconda3/envs/kt/lib/python3.11/site-packages/torch
CMake Warning at /home/xstdsr1-test/miniconda3/envs/kt/lib/python3.11/site-packages/torch/share/cmake/Torch/TorchConfig.cmake:22 (message):
static library kineto_LIBRARY-NOTFOUND not found.
Call Stack (most recent call first):
/home/xstdsr1-test/miniconda3/envs/kt/lib/python3.11/site-packages/torch/share/cmake/Torch/TorchConfig.cmake:125 (append_torchlib_if_found)
kvc2/CMakeLists.txt:53 (find_package)
CMake Warning (dev) at kvc2/CMakeLists.txt:58 (find_package):
Policy CMP0146 is not set: The FindCUDA module is removed. Run "cmake --help-policy CMP0146" for policy details. Use the cmake_policy command to set the policy and suppress this warning.
This warning is for project developers. Use -Wno-dev to suppress it.
CMake Error at kvc2/CMakeLists.txt:62 (message):
prometheus-cpp::pull not found
-- Configuring incomplete, errors occurred!
CMake args: ['-DCMAKE_LIBRARY_OUTPUT_DIRECTORY=/home/xstdsr1-test/ktransformers/build/lib.linux-x86_64-cpython-311/', '-DPYTHON_EXECUTABLE=/home/xstdsr1-test/miniconda3/envs/kt/bin/python', '-DCMAKE_BUILD_TYPE=Release', '-DKTRANSFORMERS_USE_CUDA=ON', '-D_GLIBCXX_USE_CXX11_ABI=1']
CMake args: ['-DCMAKE_LIBRARY_OUTPUT_DIRECTORY=/home/xstdsr1-test/ktransformers/build/lib.linux-x86_64-cpython-311/', '-DPYTHON_EXECUTABLE=/home/xstdsr1-test/miniconda3/envs/kt/bin/python', '-DCMAKE_BUILD_TYPE=Release', '-DKTRANSFORMERS_USE_CUDA=ON', '-D_GLIBCXX_USE_CXX11_ABI=1', '-DLLAMA_NATIVE=ON', '-DEXAMPLE_VERSION_INFO=0.3.1+cu126torch27avx2']
build_temp: /home/xstdsr1-test/ktransformers/csrc/balance_serve/build
Traceback (most recent call last):
File "/home/xstdsr1-test/miniconda3/envs/kt/lib/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 389, in
× Building wheel for ktransformers (pyproject.toml) did not run successfully.
│ exit code: 1
╰─> See above for output.
note: This error originates from a subprocess, and is likely not a problem with pip.
full command: /home/xstdsr1-test/miniconda3/envs/kt/bin/python /home/xstdsr1-test/miniconda3/envs/kt/lib/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py build_wheel /tmp/tmpgc49p76y
cwd: /home/xstdsr1-test/ktransformers
Building wheel for ktransformers (pyproject.toml) ... error
ERROR: Failed building wheel for ktransformers
Failed to build ktransformers
ERROR: Failed to build installable wheels for some pyproject.toml based projects (ktransformers)
Steps to reproduce
git clone https://github.com/kvcache-ai/ktransformers.git
cd ktransformers
git submodule update --init --recursive
sudo env USE_BALANCE_SERVE=1 PYTHONPATH="$(which python)" PATH="$(dirname $(which python)):$PATH" bash ./install.sh
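The CMake error in the log says that third_party/prometheus-cpp does not contain a CMakeLists.txt, which is what an empty or uninitialized submodule checkout would look like. A minimal sketch for checking this before running install.sh, assuming it is run from the repository root; the helper name missing_cmake_lists is hypothetical, and only the one directory named in the log is checked:

```python
from pathlib import Path


def missing_cmake_lists(repo_root, subdirs):
    """Return the entries of subdirs (relative to repo_root) that have no CMakeLists.txt."""
    root = Path(repo_root)
    return [d for d in subdirs if not (root / d / "CMakeLists.txt").is_file()]


# Hypothetical usage from the repository root. Only prometheus-cpp is named
# in the build log; add other third_party directories as needed.
for d in missing_cmake_lists(".", ["third_party/prometheus-cpp"]):
    print(f"missing CMakeLists.txt: {d}")
```

If the directory is reported as missing, the submodule contents were probably not checked out, and the build would be expected to fail at the same add_subdirectory call shown in the log.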
Environment
CPU: Intel i7-8700
GPU: NVIDIA GeForce GTX 1080
MEM: 24G
OS: Ubuntu 24.04 (WSL2) on Win11
(kt) xstdsr1-test@DESKTOP-85TF6G2:~$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2024 NVIDIA Corporation
Built on Tue_Oct_29_23:50:19_PDT_2024
Cuda compilation tools, release 12.6, V12.6.85
Build cuda_12.6.r12.6/compiler.35059454_0
(kt) xstdsr1-test@DESKTOP-85TF6G2:~$ python --version
Python 3.11.13
(kt) xstdsr1-test@DESKTOP-85TF6G2:~$ cmake --version
cmake version 4.0.3
CMake suite maintained and supported by Kitware (kitware.com/cmake).
(kt) xstdsr1-test@DESKTOP-85TF6G2:~$ python -c "import torch; print(torch.__version__)"
2.7.1+cu126
(kt) xstdsr1-test@DESKTOP-85TF6G2:~$ python -c "import torch; print(torch.cuda.is_available())"
True
(kt) xstdsr1-test@DESKTOP-85TF6G2:~$ gcc --version
gcc (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0
Copyright (C) 2023 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
(kt) xstdsr1-test@DESKTOP-85TF6G2:~$ g++ --version
g++ (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0
Copyright (C) 2023 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
CPU_INSTRUCT=NATIVE USE_BALANCE_SERVE=1 USE_NUMA=1 TORCH_CUDA_ARCH_LIST="8.0;8.6;8.7;8.9;9.0;12.0" bash ./install.sh
I also encountered the same problem. Adding USE_BALANCE_SERVE=1 will cause the build to fail.
I'm facing the same issue. Adding USE_BALANCE_SERVE=1 causes the build to fail. Could anyone provide a solution?