
Building TensorRT-LLM outside the container fails with errors

Jye-525 opened this issue 1 year ago • 1 comment

Does TensorRT-LLM require CUDA 12.2, or can I build it with CUDA 12.0 without the container? Here is my environment:

  • CUDA version: 12.0
  • TensorRT version: 9.2.0.5
  • cuDNN version: 8.9.7.29
  • NCCL version: 2.18.3-1

Running the command "python3 scripts/build_wheel.py --clean --trt_root /home/jieye/viper2/libraries/TensorRT-9.2.0.5.Linux.x86_64-gnu.cuda-12.2.cudnn8.9 --nccl_root /soft/libraries/nccl/nccl_2.18.3-1+cuda12.2_x86_64 --cudnn_root /home/jieye/viper2/libraries/cudnn-12-linux-x64-v8.9.7.29" reports the following errors:

ptxas /var/tmp/pbs.1280863.polaris-pbs-01.hsn.cm.polaris.alcf.anl.gov/tmpxft_000025d7_00000000-6_cudaFp8Utils.compute_90.ptx, line 620; error : Vector qualifier is not allowed for instruction 'atom'
ptxas /var/tmp/pbs.1280863.polaris-pbs-01.hsn.cm.polaris.alcf.anl.gov/tmpxft_000025d7_00000000-6_cudaFp8Utils.compute_90.ptx, line 620; error : Operation .max requires .u32 or .s32 or .u64 or .s64 type for instruction 'atom'
ptxas /var/tmp/pbs.1280863.polaris-pbs-01.hsn.cm.polaris.alcf.anl.gov/tmpxft_000025d7_00000000-6_cudaFp8Utils.compute_90.ptx, line 620; error : Vector operand is not allowed for instruction 'atom'
ptxas /var/tmp/pbs.1280863.polaris-pbs-01.hsn.cm.polaris.alcf.anl.gov/tmpxft_000025d7_00000000-6_cudaFp8Utils.compute_90.ptx, line 620; error : Vector operand is not allowed for instruction 'atom'
ptxas /var/tmp/pbs.1280863.polaris-pbs-01.hsn.cm.polaris.alcf.anl.gov/tmpxft_000025d7_00000000-6_cudaFp8Utils.compute_90.ptx, line 1909; warning : Vector Type not specified properly
ptxas /var/tmp/pbs.1280863.polaris-pbs-01.hsn.cm.polaris.alcf.anl.gov/tmpxft_000025d7_00000000-6_cudaFp8Utils.compute_90.ptx, line 1909; error : Vector qualifier is not allowed for instruction 'atom'
ptxas /var/tmp/pbs.1280863.polaris-pbs-01.hsn.cm.polaris.alcf.anl.gov/tmpxft_000025d7_00000000-6_cudaFp8Utils.compute_90.ptx, line 1909; error : Operation .max requires .u32 or .s32 or .u64 or .s64 type for instruction 'atom'
ptxas /var/tmp/pbs.1280863.polaris-pbs-01.hsn.cm.polaris.alcf.anl.gov/tmpxft_000025d7_00000000-6_cudaFp8Utils.compute_90.ptx, line 1909; error : Arguments mismatch for instruction 'atom'
ptxas /var/tmp/pbs.1280863.polaris-pbs-01.hsn.cm.polaris.alcf.anl.gov/tmpxft_000025d7_00000000-6_cudaFp8Utils.compute_90.ptx, line 1909; error : Arguments mismatch for instruction 'atom'
ptxas /var/tmp/pbs.1280863.polaris-pbs-01.hsn.cm.polaris.alcf.anl.gov/tmpxft_000025d7_00000000-6_cudaFp8Utils.compute_90.ptx, line 5395; error : Vector qualifier is not allowed for instruction 'atom'
ptxas /var/tmp/pbs.1280863.polaris-pbs-01.hsn.cm.polaris.alcf.anl.gov/tmpxft_000025d7_00000000-6_cudaFp8Utils.compute_90.ptx, line 5395; error : Operation .max requires .u32 or .s32 or .u64 or .s64 type for instruction 'atom'
ptxas /var/tmp/pbs.1280863.polaris-pbs-01.hsn.cm.polaris.alcf.anl.gov/tmpxft_000025d7_00000000-6_cudaFp8Utils.compute_90.ptx, line 5395; error : Vector operand is not allowed for instruction 'atom'
ptxas /var/tmp/pbs.1280863.polaris-pbs-01.hsn.cm.polaris.alcf.anl.gov/tmpxft_000025d7_00000000-6_cudaFp8Utils.compute_90.ptx, line 5395; error : Vector operand is not allowed for instruction 'atom'
ptxas /var/tmp/pbs.1280863.polaris-pbs-01.hsn.cm.polaris.alcf.anl.gov/tmpxft_000025d7_00000000-6_cudaFp8Utils.compute_90.ptx, line 8963; warning : Vector Type not specified properly
ptxas /var/tmp/pbs.1280863.polaris-pbs-01.hsn.cm.polaris.alcf.anl.gov/tmpxft_000025d7_00000000-6_cudaFp8Utils.compute_90.ptx, line 8963; error : Vector qualifier is not allowed for instruction 'atom'
ptxas /var/tmp/pbs.1280863.polaris-pbs-01.hsn.cm.polaris.alcf.anl.gov/tmpxft_000025d7_00000000-6_cudaFp8Utils.compute_90.ptx, line 8963; error : Operation .max requires .u32 or .s32 or .u64 or .s64 type for instruction 'atom'
ptxas /var/tmp/pbs.1280863.polaris-pbs-01.hsn.cm.polaris.alcf.anl.gov/tmpxft_000025d7_00000000-6_cudaFp8Utils.compute_90.ptx, line 8963; error : Arguments mismatch for instruction 'atom'
ptxas /var/tmp/pbs.1280863.polaris-pbs-01.hsn.cm.polaris.alcf.anl.gov/tmpxft_000025d7_00000000-6_cudaFp8Utils.compute_90.ptx, line 8963; error : Arguments mismatch for instruction 'atom'
ptxas fatal : Ptx assembly aborted due to errors
gmake[3]: *** [tensorrt_llm/common/CMakeFiles/common_src.dir/build.make:217: tensorrt_llm/common/CMakeFiles/common_src.dir/cudaFp8Utils.cu.o] Error 255
gmake[3]: *** Waiting for unfinished jobs....
[ 94%] Building CUDA object tensorrt_llm/kernels/CMakeFiles/kernels_src.dir/decodingCommon.cu.o
[ 94%] Building CUDA object tensorrt_llm/kernels/CMakeFiles/kernels_src.dir/decodingKernels.cu.o
gmake[2]: *** [CMakeFiles/Makefile2:780: tensorrt_llm/common/CMakeFiles/common_src.dir/all] Error 2
gmake[2]: *** Waiting for unfinished jobs....
[ 94%] Building CUDA object tensorrt_llm/kernels/CMakeFiles/kernels_src.dir/gptKernels.cu.o

Jye-525 avatar Feb 14 '24 04:02 Jye-525
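The ptxas errors above are consistent with the CUDA 12.0 toolchain being too old for the FP8 kernels (ptxas rejects the vectorized `atom.max` forms at compile time), which matches the eventual resolution in this thread of moving to CUDA 12.2. As a rough sanity check, one could compare the CUDA version encoded in the TensorRT tarball name against the locally installed toolkit. This is only a sketch: the tarball name is taken from the paths quoted in this thread, and the `sed` parsing is an assumption about NVIDIA's naming convention, not an official tool.

```shell
# Sketch: compare the CUDA version the TensorRT tarball targets
# (encoded in its file name) with the locally installed toolkit.
trt_tarball="TensorRT-9.2.0.5.Linux.x86_64-gnu.cuda-12.2.cudnn8.9"   # from the --trt_root path above
trt_cuda=$(echo "$trt_tarball" | sed -n 's/.*cuda-\([0-9.]*\)\.cudnn.*/\1/p')
local_cuda=$(nvcc --version 2>/dev/null | sed -n 's/.*release \([0-9.]*\),.*/\1/p')
echo "TensorRT built for CUDA $trt_cuda, local toolkit is CUDA ${local_cuda:-not found}"
if [ "$trt_cuda" != "$local_cuda" ]; then
  echo "warning: toolkit mismatch; an older ptxas may reject PTX written for CUDA $trt_cuda" >&2
fi
```

If the two versions differ, aligning the local toolkit with the version in the tarball name (here 12.2) is the first thing to try.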

BTW, does TensorRT-LLM have to use Open MPI? Can I build it with MPICH or Cray MPICH? When I build TensorRT-LLM without Open MPI, it always reports the following error:

[100%] Built target tensorrt_llm_static
[100%] Linking CXX shared library libtensorrt_llm.so
/usr/bin/ld: /home/jieye/viper2/TensorRT-LLM/cpp/tensorrt_llm/batch_manager/x86_64-linux-gnu/libtensorrt_llm_batch_manager_static.pre_cxx11.a(kvCacheManager.cpp.o): in function `tensorrt_llm::batch_manager::kv_cache_manager::KVCacheManager::getMaxNumTokens(tensorrt_llm::batch_manager::kv_cache_manager::KvCacheConfig const&, nvinfer1::DataType, tensorrt_llm::runtime::GptModelConfig const&, tensorrt_llm::runtime::WorldConfig const&, tensorrt_llm::runtime::BufferManager const&)':
kvCacheManager.cpp:(.text+0x213e): undefined reference to `ompi_mpi_comm_world'
collect2: error: ld returned 1 exit status
gmake[3]: *** [tensorrt_llm/CMakeFiles/tensorrt_llm.dir/build.make:1221: tensorrt_llm/libtensorrt_llm.so] Error 1
gmake[2]: *** [CMakeFiles/Makefile2:743: tensorrt_llm/CMakeFiles/tensorrt_llm.dir/all] Error 2
gmake[1]: *** [CMakeFiles/Makefile2:750: tensorrt_llm/CMakeFiles/tensorrt_llm.dir/rule] Error 2
gmake: *** [Makefile:179: tensorrt_llm] Error 2
Traceback (most recent call last):
  File "/home/jieye/viper2/TensorRT-LLM/scripts/build_wheel.py", line 312, in <module>
    main(**vars(args))
  File "/home/jieye/viper2/TensorRT-LLM/scripts/build_wheel.py", line 168, in main
    build_run(
  File "/home/jieye/.conda/envs/dspeed_env/lib/python3.10/subprocess.py", line 526, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command 'cmake --build . --config Release --parallel 64 --target tensorrt_llm tensorrt_llm_static nvinfer_plugin_tensorrt_llm th_common bindings benchmarks' returned non-zero exit status 2.

Jye-525 avatar Feb 14 '24 20:02 Jye-525

I got the same error without the container.

Hukongtao avatar Mar 01 '24 05:03 Hukongtao

> I got the same error without the container.

I solved this problem by using CUDA 12.2 and Open MPI.

Jye-525 avatar Mar 01 '24 16:03 Jye-525

> I got the same error without the container.

> I solved this problem by using CUDA 12.2 and Open MPI.

Glad your issue is solved. Please try using the container, since the dependencies inside it are well tested.

litaotju avatar Mar 22 '24 14:03 litaotju

> openmpi

What do you mean by "openmpi"?

Hukongtao avatar Mar 25 '24 03:03 Hukongtao

> openmpi

> What do you mean by "openmpi"?

I mean that I installed the Open MPI library instead of the MPICH library. TensorRT-LLM's KVCache manager relies on MPI for some of its communication, and the shipped precompiled static library (libtensorrt_llm_batch_manager_static.pre_cxx11.a) is linked against Open MPI.

Jye-525 avatar Mar 25 '24 03:03 Jye-525
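For anyone hitting the same confusion: a quick way to tell which MPI implementation a given `mpicc` wraps is to ask the wrapper to print its underlying compiler command line instead of running it. MPICH's wrapper accepts `-show` and Open MPI's accepts `-showme`, so trying both covers either installation (this assumes `mpicc` is on your PATH):

```shell
# Print the underlying compile/link line of the mpicc wrapper.
# MPICH understands -show; Open MPI understands -showme.
mpicc -show 2>/dev/null || mpicc -showme 2>/dev/null
# Open MPI output typically links -lmpi and uses Open MPI include dirs;
# MPICH (and cray-mpich) output typically links -lmpich.
```

If the output shows `-lmpich`, that toolchain cannot resolve the `ompi_*` symbols the prebuilt TensorRT-LLM static library references.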