No such file 'libtensorrt_llm.so' while building wheel
System Info
- CPU: x86_64
- GPU name: NVIDIA H100
Who can help?
No response
Information
- [X] The official example scripts
- [ ] My own modified scripts
Tasks
- [ ] An officially supported task in the `examples` folder (such as GLUE/SQuAD, ...)
- [ ] My own task or dataset (give details below)
Reproduction
step 1. docker build
make -C docker build -> success
step 2. docker run
make -C docker run -> success
step 3. build wheel
python3 ./scripts/build_wheel.py --clean --trt_root /usr/local/tensorrt
- error trace
-- The CXX compiler identification is GNU 11.4.0
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- NVTX is disabled
-- Importing batch manager
-- Importing executor
-- Building PyTorch
-- Building Google tests
-- Building benchmarks
-- Looking for a CUDA compiler
-- Looking for a CUDA compiler - /usr/local/cuda/bin/nvcc
-- CUDA compiler: /usr/local/cuda/bin/nvcc
-- GPU architectures: 70-real;80-real;86-real;89-real;90-real
-- The C compiler identification is GNU 11.4.0
-- The CUDA compiler identification is NVIDIA 12.3.107
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CUDA compiler ABI info
-- Detecting CUDA compiler ABI info - done
-- Check for working CUDA compiler: /usr/local/cuda/bin/nvcc - skipped
-- Detecting CUDA compile features
-- Detecting CUDA compile features - done
-- Found CUDAToolkit: /usr/local/cuda/include (found version "12.3.107")
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE
-- CUDA library status:
-- version: 12.3.107
-- libraries: /usr/local/cuda/lib64
-- include path: /usr/local/cuda/targets/x86_64-linux/include
-- ========================= Importing and creating target nvinfer ==========================
-- Looking for library nvinfer
-- Library that was found /usr/local/tensorrt/targets/x86_64-linux-gnu/lib/libnvinfer.so
-- ==========================================================================================
-- CUDAToolkit_VERSION 12.3 is greater or equal than 11.0, enable -DENABLE_BF16 flag
-- CUDAToolkit_VERSION 12.3 is greater or equal than 11.8, enable -DENABLE_FP8 flag
-- Found MPI_C: /opt/hpcx/ompi/lib/libmpi.so (found version "3.1")
-- Found MPI_CXX: /opt/hpcx/ompi/lib/libmpi.so (found version "3.1")
-- Found MPI: TRUE (found version "3.1")
-- COMMON_HEADER_DIRS: /code/tensorrt_llm/cpp;/usr/local/cuda/include
-- Found Python3: /usr/bin/python3.10 (found version "3.10.12") found components: Interpreter Development Development.Module Development.Embed
-- USE_CXX11_ABI is set by python Torch to 1
-- TORCH_CUDA_ARCH_LIST: 7.0;8.0;8.6;8.9;9.0
CMake Warning at CMakeLists.txt:313 (message):
Ignoring environment variable TORCH_CUDA_ARCH_LIST=5.2 6.0 6.1 7.0 7.2 7.5
8.0 8.6 8.7 9.0+PTX
-- Found Python executable at /usr/bin/python3.10
-- Found Python libraries at /usr/lib/x86_64-linux-gnu
-- Found CUDA: /usr/local/cuda (found version "12.3")
-- Found CUDAToolkit: /usr/local/cuda/include (found version "12.3.107")
-- Caffe2: CUDA detected: 12.3
-- Caffe2: CUDA nvcc is: /usr/local/cuda/bin/nvcc
-- Caffe2: CUDA toolkit directory: /usr/local/cuda
-- Caffe2: Header version is: 12.3
-- /usr/local/cuda-12.3/targets/x86_64-linux/lib/libnvrtc.so shorthash is e150bf88
-- USE_CUDNN is set to 0. Compiling without cuDNN support
-- USE_CUSPARSELT is set to 0. Compiling without cuSPARSELt support
-- Added CUDA NVCC flags for: -gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_89,code=sm_89;-gencode;arch=compute_90,code=sm_90
CMake Warning at /usr/local/lib/python3.10/dist-packages/torch/share/cmake/Torch/TorchConfig.cmake:22 (message):
static library kineto_LIBRARY-NOTFOUND not found.
Call Stack (most recent call first):
/usr/local/lib/python3.10/dist-packages/torch/share/cmake/Torch/TorchConfig.cmake:127 (append_torchlib_if_found)
CMakeLists.txt:346 (find_package)
-- Found Torch: /usr/local/lib/python3.10/dist-packages/torch/lib/libtorch.so
-- TORCH_CXX_FLAGS: -D_GLIBCXX_USE_CXX11_ABI=1
-- Building for TensorRT version: 9.2.0, library version: 9
-- Using MPI_C_INCLUDE_DIRS: /opt/hpcx/ompi/include;/opt/hpcx/ompi/include/openmpi;/opt/hpcx/ompi/include/openmpi/opal/mca/hwloc/hwloc201/hwloc/include;/opt/hpcx/ompi/include/openmpi/opal/mca/event/libevent2022/libevent;/opt/hpcx/ompi/include/openmpi/opal/mca/event/libevent2022/libevent/include
-- Using MPI_C_LIBRARIES: /opt/hpcx/ompi/lib/libmpi.so
running develop
/usr/local/lib/python3.10/dist-packages/setuptools/command/develop.py:40: EasyInstallDeprecationWarning: easy_install command is deprecated.
!!
********************************************************************************
Please avoid running ``setup.py`` and ``easy_install``.
Instead, use pypa/build, pypa/installer or other
standards-based tools.
See https://github.com/pypa/setuptools/issues/917 for details.
********************************************************************************
!!
easy_install.initialize_options(self)
/usr/local/lib/python3.10/dist-packages/setuptools/_distutils/cmd.py:66: SetuptoolsDeprecationWarning: setup.py install is deprecated.
!!
********************************************************************************
Please avoid running ``setup.py`` directly.
Instead, use pypa/build, pypa/installer or other
standards-based tools.
See https://blog.ganssle.io/articles/2021/10/setup-py-deprecated.html for details.
********************************************************************************
!!
self.initialize_options()
running egg_info
writing cutlass_library.egg-info/PKG-INFO
writing dependency_links to cutlass_library.egg-info/dependency_links.txt
writing top-level names to cutlass_library.egg-info/top_level.txt
reading manifest file 'cutlass_library.egg-info/SOURCES.txt'
adding license file 'LICENSE.txt'
writing manifest file 'cutlass_library.egg-info/SOURCES.txt'
running build_ext
Creating /root/.local/lib/python3.10/site-packages/cutlass-library.egg-link (link to .)
cutlass-library 3.4.0 is already the active version in easy-install.pth
Installed /code/tensorrt_llm/3rdparty/cutlass/python
Processing dependencies for cutlass-library==3.4.0
Finished processing dependencies for cutlass-library==3.4.0
-- MANUALLY APPENDING FLAG TO COMPILE FOR SM_90a.
-- CMAKE_SYSTEM_PROCESSOR: x86_64
-- Operating System: ubuntu, 22.04
-- Performing Test HAS_FLTO
-- Performing Test HAS_FLTO - Success
-- Found pybind11: /usr/local/lib/python3.10/dist-packages/pybind11/include (found version "2.11.1")
-- Found Python: /usr/bin/python3.10 (found version "3.10.12") found components: Interpreter
-- ========================= Importing and creating target nvonnxparser ==========================
-- Looking for library nvonnxparser
-- Library that was found /usr/local/tensorrt/targets/x86_64-linux-gnu/lib/libnvonnxparser.so
-- ==========================================================================================
-- Configuring done (16.2s)
-- Generating done (4.0s)
-- Build files have been written to: /code/tensorrt_llm/cpp/build
Traceback (most recent call last):
File "/code/tensorrt_llm/./scripts/build_wheel.py", line 349, in <module>
main(**vars(args))
File "/code/tensorrt_llm/./scripts/build_wheel.py", line 198, in main
copy(
File "/usr/lib/python3.10/shutil.py", line 417, in copy
copyfile(src, dst, follow_symlinks=follow_symlinks)
File "/usr/lib/python3.10/shutil.py", line 254, in copyfile
with open(src, 'rb') as fsrc:
FileNotFoundError: [Errno 2] No such file or directory: '/code/tensorrt_llm/cpp/build/tensorrt_llm/libtensorrt_llm.so'
Expected behavior
Wheel is installed successfully
actual behavior
The wheel build failed with the following error message:
Traceback (most recent call last):
File "/code/tensorrt_llm/./scripts/build_wheel.py", line 349, in <module>
main(**vars(args))
File "/code/tensorrt_llm/./scripts/build_wheel.py", line 198, in main
copy(
File "/usr/lib/python3.10/shutil.py", line 417, in copy
copyfile(src, dst, follow_symlinks=follow_symlinks)
File "/usr/lib/python3.10/shutil.py", line 254, in copyfile
with open(src, 'rb') as fsrc:
FileNotFoundError: [Errno 2] No such file or directory: '/code/tensorrt_llm/cpp/build/tensorrt_llm/libtensorrt_llm.so'
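The traceback shows `shutil.copy` raising a bare `FileNotFoundError` because `libtensorrt_llm.so` is missing from the build directory, which says little about why. Below is a minimal sketch of a pre-copy check that reports whether the library exists anywhere under the build tree. The function `copy_built_library` and its parameters are hypothetical and are not part of `build_wheel.py`. The sketch assumes nothing about that script beyond what the traceback shows.

```python
from pathlib import Path
import shutil
import tempfile


def copy_built_library(build_dir: Path, rel_path: str, dst_dir: Path) -> Path:
    """Copy build_dir/rel_path into dst_dir, with a diagnostic error if it is missing.

    Hypothetical helper for illustration; not taken from build_wheel.py.
    """
    src = build_dir / rel_path
    if not src.is_file():
        # Search the build tree for a file with the same name to aid diagnosis.
        candidates = sorted(str(p) for p in build_dir.rglob(src.name))
        if candidates:
            hint = f"found elsewhere: {candidates}"
        else:
            hint = "not found anywhere under the build directory"
        raise FileNotFoundError(
            f"{src} does not exist ({hint}); "
            "the compile step may not have run or may have failed"
        )
    dst_dir.mkdir(parents=True, exist_ok=True)
    # shutil.copy returns the path of the newly created file.
    return Path(shutil.copy(src, dst_dir))


if __name__ == "__main__":
    # Demonstration on a throwaway directory, not on a real checkout.
    with tempfile.TemporaryDirectory() as tmp:
        build_dir = Path(tmp) / "build"
        build_dir.mkdir()
        try:
            copy_built_library(
                build_dir, "tensorrt_llm/libtensorrt_llm.so", Path(tmp) / "libs"
            )
        except FileNotFoundError as exc:
            print(exc)
```

If the message reports that the library is not found anywhere under the build directory, the compile step is the place to look, not the copy step.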
additional notes
.
@Shixiaowei02 any updates on this?
@lifelongeeek, apologies for the very delayed response. I'm not sure if this issue is still relevant to you, but I confirmed that the latest version works without the problem. When you have a chance, could you try it out and see if it works on your end?
Issue has not received an update in over 14 days. Adding stale label.
This issue was closed because it has been 14 days without activity since it has been marked as stale.