
No such file 'libtensorrt_llm.so' while building wheel

Open · lifelongeeek opened this issue 1 year ago · 1 comment

System Info

  • CPU: x86_64
  • GPU name: NVIDIA H100

Who can help?

No response

Information

  • [X] The official example scripts
  • [ ] My own modified scripts

Tasks

  • [ ] An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • [ ] My own task or dataset (give details below)

Reproduction

step 1. docker build

make -C docker build -> success

step 2. docker run

make -C docker run -> success

step 3. build wheel

python3 ./scripts/build_wheel.py --clean --trt_root /usr/local/tensorrt

  • error trace
-- The CXX compiler identification is GNU 11.4.0
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- NVTX is disabled
-- Importing batch manager
-- Importing executor
-- Building PyTorch
-- Building Google tests
-- Building benchmarks
-- Looking for a CUDA compiler
-- Looking for a CUDA compiler - /usr/local/cuda/bin/nvcc
-- CUDA compiler: /usr/local/cuda/bin/nvcc
-- GPU architectures: 70-real;80-real;86-real;89-real;90-real
-- The C compiler identification is GNU 11.4.0
-- The CUDA compiler identification is NVIDIA 12.3.107
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CUDA compiler ABI info
-- Detecting CUDA compiler ABI info - done
-- Check for working CUDA compiler: /usr/local/cuda/bin/nvcc - skipped
-- Detecting CUDA compile features
-- Detecting CUDA compile features - done
-- Found CUDAToolkit: /usr/local/cuda/include (found version "12.3.107") 
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE  
-- CUDA library status:
--     version: 12.3.107
--     libraries: /usr/local/cuda/lib64
--     include path: /usr/local/cuda/targets/x86_64-linux/include
-- ========================= Importing and creating target nvinfer ==========================
-- Looking for library nvinfer
-- Library that was found /usr/local/tensorrt/targets/x86_64-linux-gnu/lib/libnvinfer.so
-- ==========================================================================================
-- CUDAToolkit_VERSION 12.3 is greater or equal than 11.0, enable -DENABLE_BF16 flag
-- CUDAToolkit_VERSION 12.3 is greater or equal than 11.8, enable -DENABLE_FP8 flag
-- Found MPI_C: /opt/hpcx/ompi/lib/libmpi.so (found version "3.1") 
-- Found MPI_CXX: /opt/hpcx/ompi/lib/libmpi.so (found version "3.1") 
-- Found MPI: TRUE (found version "3.1")  
-- COMMON_HEADER_DIRS: /code/tensorrt_llm/cpp;/usr/local/cuda/include
-- Found Python3: /usr/bin/python3.10 (found version "3.10.12") found components: Interpreter Development Development.Module Development.Embed 
-- USE_CXX11_ABI is set by python Torch to 1
-- TORCH_CUDA_ARCH_LIST: 7.0;8.0;8.6;8.9;9.0
CMake Warning at CMakeLists.txt:313 (message):
  Ignoring environment variable TORCH_CUDA_ARCH_LIST=5.2 6.0 6.1 7.0 7.2 7.5
  8.0 8.6 8.7 9.0+PTX


-- Found Python executable at /usr/bin/python3.10
-- Found Python libraries at /usr/lib/x86_64-linux-gnu
-- Found CUDA: /usr/local/cuda (found version "12.3") 
-- Found CUDAToolkit: /usr/local/cuda/include (found version "12.3.107") 
-- Caffe2: CUDA detected: 12.3
-- Caffe2: CUDA nvcc is: /usr/local/cuda/bin/nvcc
-- Caffe2: CUDA toolkit directory: /usr/local/cuda
-- Caffe2: Header version is: 12.3
-- /usr/local/cuda-12.3/targets/x86_64-linux/lib/libnvrtc.so shorthash is e150bf88
-- USE_CUDNN is set to 0. Compiling without cuDNN support
-- USE_CUSPARSELT is set to 0. Compiling without cuSPARSELt support
-- Added CUDA NVCC flags for: -gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_89,code=sm_89;-gencode;arch=compute_90,code=sm_90
CMake Warning at /usr/local/lib/python3.10/dist-packages/torch/share/cmake/Torch/TorchConfig.cmake:22 (message):
  static library kineto_LIBRARY-NOTFOUND not found.
Call Stack (most recent call first):
  /usr/local/lib/python3.10/dist-packages/torch/share/cmake/Torch/TorchConfig.cmake:127 (append_torchlib_if_found)
  CMakeLists.txt:346 (find_package)


-- Found Torch: /usr/local/lib/python3.10/dist-packages/torch/lib/libtorch.so  
-- TORCH_CXX_FLAGS: -D_GLIBCXX_USE_CXX11_ABI=1
-- Building for TensorRT version: 9.2.0, library version: 9
-- Using MPI_C_INCLUDE_DIRS: /opt/hpcx/ompi/include;/opt/hpcx/ompi/include/openmpi;/opt/hpcx/ompi/include/openmpi/opal/mca/hwloc/hwloc201/hwloc/include;/opt/hpcx/ompi/include/openmpi/opal/mca/event/libevent2022/libevent;/opt/hpcx/ompi/include/openmpi/opal/mca/event/libevent2022/libevent/include
-- Using MPI_C_LIBRARIES: /opt/hpcx/ompi/lib/libmpi.so
running develop
/usr/local/lib/python3.10/dist-packages/setuptools/command/develop.py:40: EasyInstallDeprecationWarning: easy_install command is deprecated.
!!

        ********************************************************************************
        Please avoid running ``setup.py`` and ``easy_install``.
        Instead, use pypa/build, pypa/installer or other
        standards-based tools.

        See https://github.com/pypa/setuptools/issues/917 for details.
        ********************************************************************************

!!
  easy_install.initialize_options(self)
/usr/local/lib/python3.10/dist-packages/setuptools/_distutils/cmd.py:66: SetuptoolsDeprecationWarning: setup.py install is deprecated.
!!

        ********************************************************************************
        Please avoid running ``setup.py`` directly.
        Instead, use pypa/build, pypa/installer or other
        standards-based tools.

        See https://blog.ganssle.io/articles/2021/10/setup-py-deprecated.html for details.
        ********************************************************************************

!!
  self.initialize_options()
running egg_info
writing cutlass_library.egg-info/PKG-INFO
writing dependency_links to cutlass_library.egg-info/dependency_links.txt
writing top-level names to cutlass_library.egg-info/top_level.txt
reading manifest file 'cutlass_library.egg-info/SOURCES.txt'
adding license file 'LICENSE.txt'
writing manifest file 'cutlass_library.egg-info/SOURCES.txt'
running build_ext
Creating /root/.local/lib/python3.10/site-packages/cutlass-library.egg-link (link to .)
cutlass-library 3.4.0 is already the active version in easy-install.pth

Installed /code/tensorrt_llm/3rdparty/cutlass/python
Processing dependencies for cutlass-library==3.4.0
Finished processing dependencies for cutlass-library==3.4.0
-- MANUALLY APPENDING FLAG TO COMPILE FOR SM_90a.
-- CMAKE_SYSTEM_PROCESSOR: x86_64
-- Operating System: ubuntu, 22.04
-- Performing Test HAS_FLTO
-- Performing Test HAS_FLTO - Success
-- Found pybind11: /usr/local/lib/python3.10/dist-packages/pybind11/include (found version "2.11.1")
-- Found Python: /usr/bin/python3.10 (found version "3.10.12") found components: Interpreter 
-- ========================= Importing and creating target nvonnxparser ==========================
-- Looking for library nvonnxparser
-- Library that was found /usr/local/tensorrt/targets/x86_64-linux-gnu/lib/libnvonnxparser.so
-- ==========================================================================================
-- Configuring done (16.2s)
-- Generating done (4.0s)
-- Build files have been written to: /code/tensorrt_llm/cpp/build
Traceback (most recent call last):
  File "/code/tensorrt_llm/./scripts/build_wheel.py", line 349, in <module>
    main(**vars(args))
  File "/code/tensorrt_llm/./scripts/build_wheel.py", line 198, in main
    copy(
  File "/usr/lib/python3.10/shutil.py", line 417, in copy
    copyfile(src, dst, follow_symlinks=follow_symlinks)
  File "/usr/lib/python3.10/shutil.py", line 254, in copyfile
    with open(src, 'rb') as fsrc:
FileNotFoundError: [Errno 2] No such file or directory: '/code/tensorrt_llm/cpp/build/tensorrt_llm/libtensorrt_llm.so'
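For context on the traceback: `build_wheel.py` reaches a plain `shutil.copy` of the freshly built `libtensorrt_llm.so`, and `shutil.copy` raises `FileNotFoundError` whenever the source artifact was never produced (i.e. the C++ build step silently failed or was skipped after CMake configuration succeeded). The snippet below is a hypothetical illustration of that failure mode and of a defensive pre-check; `safe_copy` is not part of the TensorRT-LLM scripts.

```python
import os
import shutil
import tempfile

def safe_copy(src, dst):
    """Copy src to dst; report a missing build artifact instead of crashing."""
    if not os.path.exists(src):
        # This is the situation in the traceback above: the .so was never built.
        print(f"Build artifact missing: {src} - check the C++ build step output")
        return False
    shutil.copy(src, dst)
    return True

with tempfile.TemporaryDirectory() as d:
    missing = os.path.join(d, "libtensorrt_llm.so")
    ok = safe_copy(missing, d)                       # artifact absent -> False
    open(missing, "wb").close()                      # simulate a completed build
    ok2 = safe_copy(missing, os.path.join(d, "copy.so"))  # artifact present -> True
    print(ok, ok2)
```

The practical takeaway is that the copy failure is a symptom: the actual compile/link errors, if any, appear earlier in the build output, before CMake's "Build files have been written" line.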

Expected behavior

The wheel builds and installs successfully.
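A quick way to confirm the install actually went through is an import check (a generic sketch, not part of the build scripts):

```shell
# Hypothetical sanity check: the module is only findable once the wheel
# (which bundles libtensorrt_llm.so) has been installed.
python3 -c "import importlib.util; print('installed' if importlib.util.find_spec('tensorrt_llm') else 'not installed')"
```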

Actual behavior

Wheel installation fails with FileNotFoundError: [Errno 2] No such file or directory: '/code/tensorrt_llm/cpp/build/tensorrt_llm/libtensorrt_llm.so' (full traceback in the Reproduction section above).

Additional notes

.

lifelongeeek avatar Mar 02 '24 07:03 lifelongeeek

@Shixiaowei02 any updates on this?

poweiw avatar May 16 '25 20:05 poweiw

@lifelongeeek, apologies for the very delayed response. I'm not sure if this issue is still relevant to you, but I confirmed that the latest version works without the problem. When you have a chance, could you try it out and see if it works on your end?

karljang avatar Sep 22 '25 16:09 karljang

Issue has not received an update in over 14 days. Adding stale label.

github-actions[bot] avatar Oct 07 '25 03:10 github-actions[bot]

This issue was closed because it has been 14 days without activity since it has been marked as stale.

github-actions[bot] avatar Oct 21 '25 03:10 github-actions[bot]