TensorRT-LLM
Building TensorRT-LLM without the container fails with errors
Does TensorRT-LLM have to use CUDA 12.2? Can I build it with CUDA 12.0 without the container? Here is my environment information:
- CUDA version: 12.0
- TensorRT version: 9.2.0.5
- cuDNN version: 8.9.7.29
- NCCL version: 2.18.3-1
Running the command "python3 scripts/build_wheel.py --clean --trt_root /home/jieye/viper2/libraries/TensorRT-9.2.0.5.Linux.x86_64-gnu.cuda-12.2.cudnn8.9 --nccl_root /soft/libraries/nccl/nccl_2.18.3-1+cuda12.2_x86_64 --cudnn_root /home/jieye/viper2/libraries/cudnn-12-linux-x64-v8.9.7.29" produces the following errors:
ptxas /var/tmp/pbs.1280863.polaris-pbs-01.hsn.cm.polaris.alcf.anl.gov/tmpxft_000025d7_00000000-6_cudaFp8Utils.compute_90.ptx, line 620; error : Vector qualifier is not allowed for instruction 'atom'
ptxas /var/tmp/pbs.1280863.polaris-pbs-01.hsn.cm.polaris.alcf.anl.gov/tmpxft_000025d7_00000000-6_cudaFp8Utils.compute_90.ptx, line 620; error : Operation .max requires .u32 or .s32 or .u64 or .s64 type for instruction 'atom'
ptxas /var/tmp/pbs.1280863.polaris-pbs-01.hsn.cm.polaris.alcf.anl.gov/tmpxft_000025d7_00000000-6_cudaFp8Utils.compute_90.ptx, line 620; error : Vector operand is not allowed for instruction 'atom'
ptxas /var/tmp/pbs.1280863.polaris-pbs-01.hsn.cm.polaris.alcf.anl.gov/tmpxft_000025d7_00000000-6_cudaFp8Utils.compute_90.ptx, line 620; error : Vector operand is not allowed for instruction 'atom'
ptxas /var/tmp/pbs.1280863.polaris-pbs-01.hsn.cm.polaris.alcf.anl.gov/tmpxft_000025d7_00000000-6_cudaFp8Utils.compute_90.ptx, line 1909; warning : Vector Type not specified properly
ptxas /var/tmp/pbs.1280863.polaris-pbs-01.hsn.cm.polaris.alcf.anl.gov/tmpxft_000025d7_00000000-6_cudaFp8Utils.compute_90.ptx, line 1909; error : Vector qualifier is not allowed for instruction 'atom'
ptxas /var/tmp/pbs.1280863.polaris-pbs-01.hsn.cm.polaris.alcf.anl.gov/tmpxft_000025d7_00000000-6_cudaFp8Utils.compute_90.ptx, line 1909; error : Operation .max requires .u32 or .s32 or .u64 or .s64 type for instruction 'atom'
ptxas /var/tmp/pbs.1280863.polaris-pbs-01.hsn.cm.polaris.alcf.anl.gov/tmpxft_000025d7_00000000-6_cudaFp8Utils.compute_90.ptx, line 1909; error : Arguments mismatch for instruction 'atom'
ptxas /var/tmp/pbs.1280863.polaris-pbs-01.hsn.cm.polaris.alcf.anl.gov/tmpxft_000025d7_00000000-6_cudaFp8Utils.compute_90.ptx, line 1909; error : Arguments mismatch for instruction 'atom'
ptxas /var/tmp/pbs.1280863.polaris-pbs-01.hsn.cm.polaris.alcf.anl.gov/tmpxft_000025d7_00000000-6_cudaFp8Utils.compute_90.ptx, line 5395; error : Vector qualifier is not allowed for instruction 'atom'
ptxas /var/tmp/pbs.1280863.polaris-pbs-01.hsn.cm.polaris.alcf.anl.gov/tmpxft_000025d7_00000000-6_cudaFp8Utils.compute_90.ptx, line 5395; error : Operation .max requires .u32 or .s32 or .u64 or .s64 type for instruction 'atom'
ptxas /var/tmp/pbs.1280863.polaris-pbs-01.hsn.cm.polaris.alcf.anl.gov/tmpxft_000025d7_00000000-6_cudaFp8Utils.compute_90.ptx, line 5395; error : Vector operand is not allowed for instruction 'atom'
ptxas /var/tmp/pbs.1280863.polaris-pbs-01.hsn.cm.polaris.alcf.anl.gov/tmpxft_000025d7_00000000-6_cudaFp8Utils.compute_90.ptx, line 5395; error : Vector operand is not allowed for instruction 'atom'
ptxas /var/tmp/pbs.1280863.polaris-pbs-01.hsn.cm.polaris.alcf.anl.gov/tmpxft_000025d7_00000000-6_cudaFp8Utils.compute_90.ptx, line 8963; warning : Vector Type not specified properly
ptxas /var/tmp/pbs.1280863.polaris-pbs-01.hsn.cm.polaris.alcf.anl.gov/tmpxft_000025d7_00000000-6_cudaFp8Utils.compute_90.ptx, line 8963; error : Vector qualifier is not allowed for instruction 'atom'
ptxas /var/tmp/pbs.1280863.polaris-pbs-01.hsn.cm.polaris.alcf.anl.gov/tmpxft_000025d7_00000000-6_cudaFp8Utils.compute_90.ptx, line 8963; error : Operation .max requires .u32 or .s32 or .u64 or .s64 type for instruction 'atom'
ptxas /var/tmp/pbs.1280863.polaris-pbs-01.hsn.cm.polaris.alcf.anl.gov/tmpxft_000025d7_00000000-6_cudaFp8Utils.compute_90.ptx, line 8963; error : Arguments mismatch for instruction 'atom'
ptxas /var/tmp/pbs.1280863.polaris-pbs-01.hsn.cm.polaris.alcf.anl.gov/tmpxft_000025d7_00000000-6_cudaFp8Utils.compute_90.ptx, line 8963; error : Arguments mismatch for instruction 'atom'
ptxas fatal : Ptx assembly aborted due to errors
gmake[3]: *** [tensorrt_llm/common/CMakeFiles/common_src.dir/build.make:217: tensorrt_llm/common/CMakeFiles/common_src.dir/cudaFp8Utils.cu.o] Error 255
gmake[3]: *** Waiting for unfinished jobs....
[ 94%] Building CUDA object tensorrt_llm/kernels/CMakeFiles/kernels_src.dir/decodingCommon.cu.o
[ 94%] Building CUDA object tensorrt_llm/kernels/CMakeFiles/kernels_src.dir/decodingKernels.cu.o
gmake[2]: *** [CMakeFiles/Makefile2:780: tensorrt_llm/common/CMakeFiles/common_src.dir/all] Error 2
gmake[2]: *** Waiting for unfinished jobs....
[ 94%] Building CUDA object tensorrt_llm/kernels/CMakeFiles/kernels_src.dir/gptKernels.cu.o
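The ptxas failures all come from FP8 atomic 'atom' instructions in cudaFp8Utils.cu compiled for compute_90, which suggests the installed ptxas is older than the PTX being emitted; later in the thread, switching to CUDA 12.2 resolves it. A minimal sketch for checking your toolkit against a minimum release before starting a long build (the `version_ge` helper and the 12.2 minimum are assumptions based on this thread, not an official requirement statement):

```shell
# version_ge A B: succeeds (exit 0) if dotted version A >= B,
# using GNU sort's version comparison.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}

# In a real environment, take the version from nvcc itself, e.g.:
#   installed="$(nvcc --version | sed -n 's/.*release \([0-9.]*\),.*/\1/p')"
installed="12.0"   # the toolkit reported above
if version_ge "$installed" "12.2"; then
  echo "CUDA $installed meets the assumed 12.2 minimum"
else
  echo "CUDA $installed is older than the assumed 12.2 minimum"
fi
```

Also make sure the ptxas actually found on PATH belongs to the same toolkit as nvcc; on clusters with multiple CUDA modules loaded, `which ptxas` can point at an older install.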
BTW, does TensorRT-LLM have to use OpenMPI? Can I build it with MPICH or Cray MPICH? When I build TensorRT-LLM without OpenMPI, it always reports the following error:
[100%] Built target tensorrt_llm_static
[100%] Linking CXX shared library libtensorrt_llm.so
/usr/bin/ld: /home/jieye/viper2/TensorRT-LLM/cpp/tensorrt_llm/batch_manager/x86_64-linux-gnu/libtensorrt_llm_batch_manager_static.pre_cxx11.a(kvCacheManager.cpp.o): in function `tensorrt_llm::batch_manager::kv_cache_manager::KVCacheManager::getMaxNumTokens(tensorrt_llm::batch_manager::kv_cache_manager::KvCacheConfig const&, nvinfer1::DataType, tensorrt_llm::runtime::GptModelConfig const&, tensorrt_llm::runtime::WorldConfig const&, tensorrt_llm::runtime::BufferManager const&)':
kvCacheManager.cpp:(.text+0x213e): undefined reference to `ompi_mpi_comm_world'
collect2: error: ld returned 1 exit status
gmake[3]: *** [tensorrt_llm/CMakeFiles/tensorrt_llm.dir/build.make:1221: tensorrt_llm/libtensorrt_llm.so] Error 1
gmake[2]: *** [CMakeFiles/Makefile2:743: tensorrt_llm/CMakeFiles/tensorrt_llm.dir/all] Error 2
gmake[1]: *** [CMakeFiles/Makefile2:750: tensorrt_llm/CMakeFiles/tensorrt_llm.dir/rule] Error 2
gmake: *** [Makefile:179: tensorrt_llm] Error 2
Traceback (most recent call last):
File "/home/jieye/viper2/TensorRT-LLM/scripts/build_wheel.py", line 312, in
I got the same error without the container.
I solved this problem by using CUDA 12.2 and OpenMPI.
Glad your issue is solved. Please try using the container, since the dependencies in the container are well tested.
openmpi
What do you mean by "openmpi"?
I mean that I installed the OpenMPI library instead of using the MPICH library. TensorRT-LLM's KV cache manager relies on MPI for some communication, and the shipped precompiled static library (libtensorrt_llm_batch_manager_static.pre_cxx11.a) was built against OpenMPI.
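The symbol name in the linker error already tells you which MPI the archive expects: `ompi_*` (and `opal_*`) prefixes are OpenMPI internals, while MPICH-family implementations use `MPID_*`/`MPIR_*` internally, so an MPICH link can never satisfy an `ompi_mpi_comm_world` reference. A small sketch of that heuristic (the `classify_mpi_symbol` helper is illustrative, not part of any tool; the prefix lists are a common convention, not exhaustive):

```shell
# In practice, list the undefined symbols of the static archive and look
# for vendor-specific prefixes, e.g.:
#   nm libtensorrt_llm_batch_manager_static.pre_cxx11.a | grep ' U ompi_'
classify_mpi_symbol() {
  case "$1" in
    ompi_*|opal_*) echo "OpenMPI" ;;   # OpenMPI internal namespaces
    MPID_*|MPIR_*) echo "MPICH"  ;;    # MPICH internal namespaces
    *)             echo "unknown" ;;
  esac
}

classify_mpi_symbol "ompi_mpi_comm_world"
```

If the undefined symbols are OpenMPI-internal, the prebuilt library must be linked with OpenMPI regardless of which MPI the rest of the system uses.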