
Building TensorRT-LLM outside the container fails with errors

Jye-525 opened this issue 1 year ago • 1 comment

Does TensorRT-LLM require CUDA 12.2, or can I build it with CUDA 12.0 without the container? Here is my environment:

  • CUDA version: 12.0
  • TensorRT version: 9.2.0.5
  • cuDNN version: 8.9.7.29
  • NCCL version: 2.18.3-1

Running the command "python3 scripts/build_wheel.py --clean --trt_root /home/jieye/viper2/libraries/TensorRT-9.2.0.5.Linux.x86_64-gnu.cuda-12.2.cudnn8.9 --nccl_root /soft/libraries/nccl/nccl_2.18.3-1+cuda12.2_x86_64 --cudnn_root /home/jieye/viper2/libraries/cudnn-12-linux-x64-v8.9.7.29" reports the following errors:

ptxas /var/tmp/pbs.1280863.polaris-pbs-01.hsn.cm.polaris.alcf.anl.gov/tmpxft_000025d7_00000000-6_cudaFp8Utils.compute_90.ptx, line 620; error : Vector qualifier is not allowed for instruction 'atom'
ptxas /var/tmp/pbs.1280863.polaris-pbs-01.hsn.cm.polaris.alcf.anl.gov/tmpxft_000025d7_00000000-6_cudaFp8Utils.compute_90.ptx, line 620; error : Operation .max requires .u32 or .s32 or .u64 or .s64 type for instruction 'atom'
ptxas /var/tmp/pbs.1280863.polaris-pbs-01.hsn.cm.polaris.alcf.anl.gov/tmpxft_000025d7_00000000-6_cudaFp8Utils.compute_90.ptx, line 620; error : Vector operand is not allowed for instruction 'atom'
ptxas /var/tmp/pbs.1280863.polaris-pbs-01.hsn.cm.polaris.alcf.anl.gov/tmpxft_000025d7_00000000-6_cudaFp8Utils.compute_90.ptx, line 620; error : Vector operand is not allowed for instruction 'atom'
ptxas /var/tmp/pbs.1280863.polaris-pbs-01.hsn.cm.polaris.alcf.anl.gov/tmpxft_000025d7_00000000-6_cudaFp8Utils.compute_90.ptx, line 1909; warning : Vector Type not specified properly
ptxas /var/tmp/pbs.1280863.polaris-pbs-01.hsn.cm.polaris.alcf.anl.gov/tmpxft_000025d7_00000000-6_cudaFp8Utils.compute_90.ptx, line 1909; error : Vector qualifier is not allowed for instruction 'atom'
ptxas /var/tmp/pbs.1280863.polaris-pbs-01.hsn.cm.polaris.alcf.anl.gov/tmpxft_000025d7_00000000-6_cudaFp8Utils.compute_90.ptx, line 1909; error : Operation .max requires .u32 or .s32 or .u64 or .s64 type for instruction 'atom'
ptxas /var/tmp/pbs.1280863.polaris-pbs-01.hsn.cm.polaris.alcf.anl.gov/tmpxft_000025d7_00000000-6_cudaFp8Utils.compute_90.ptx, line 1909; error : Arguments mismatch for instruction 'atom'
ptxas /var/tmp/pbs.1280863.polaris-pbs-01.hsn.cm.polaris.alcf.anl.gov/tmpxft_000025d7_00000000-6_cudaFp8Utils.compute_90.ptx, line 1909; error : Arguments mismatch for instruction 'atom'
ptxas /var/tmp/pbs.1280863.polaris-pbs-01.hsn.cm.polaris.alcf.anl.gov/tmpxft_000025d7_00000000-6_cudaFp8Utils.compute_90.ptx, line 5395; error : Vector qualifier is not allowed for instruction 'atom'
ptxas /var/tmp/pbs.1280863.polaris-pbs-01.hsn.cm.polaris.alcf.anl.gov/tmpxft_000025d7_00000000-6_cudaFp8Utils.compute_90.ptx, line 5395; error : Operation .max requires .u32 or .s32 or .u64 or .s64 type for instruction 'atom'
ptxas /var/tmp/pbs.1280863.polaris-pbs-01.hsn.cm.polaris.alcf.anl.gov/tmpxft_000025d7_00000000-6_cudaFp8Utils.compute_90.ptx, line 5395; error : Vector operand is not allowed for instruction 'atom'
ptxas /var/tmp/pbs.1280863.polaris-pbs-01.hsn.cm.polaris.alcf.anl.gov/tmpxft_000025d7_00000000-6_cudaFp8Utils.compute_90.ptx, line 5395; error : Vector operand is not allowed for instruction 'atom'
ptxas /var/tmp/pbs.1280863.polaris-pbs-01.hsn.cm.polaris.alcf.anl.gov/tmpxft_000025d7_00000000-6_cudaFp8Utils.compute_90.ptx, line 8963; warning : Vector Type not specified properly
ptxas /var/tmp/pbs.1280863.polaris-pbs-01.hsn.cm.polaris.alcf.anl.gov/tmpxft_000025d7_00000000-6_cudaFp8Utils.compute_90.ptx, line 8963; error : Vector qualifier is not allowed for instruction 'atom'
ptxas /var/tmp/pbs.1280863.polaris-pbs-01.hsn.cm.polaris.alcf.anl.gov/tmpxft_000025d7_00000000-6_cudaFp8Utils.compute_90.ptx, line 8963; error : Operation .max requires .u32 or .s32 or .u64 or .s64 type for instruction 'atom'
ptxas /var/tmp/pbs.1280863.polaris-pbs-01.hsn.cm.polaris.alcf.anl.gov/tmpxft_000025d7_00000000-6_cudaFp8Utils.compute_90.ptx, line 8963; error : Arguments mismatch for instruction 'atom'
ptxas /var/tmp/pbs.1280863.polaris-pbs-01.hsn.cm.polaris.alcf.anl.gov/tmpxft_000025d7_00000000-6_cudaFp8Utils.compute_90.ptx, line 8963; error : Arguments mismatch for instruction 'atom'
ptxas fatal : Ptx assembly aborted due to errors
gmake[3]: *** [tensorrt_llm/common/CMakeFiles/common_src.dir/build.make:217: tensorrt_llm/common/CMakeFiles/common_src.dir/cudaFp8Utils.cu.o] Error 255
gmake[3]: *** Waiting for unfinished jobs....
[ 94%] Building CUDA object tensorrt_llm/kernels/CMakeFiles/kernels_src.dir/decodingCommon.cu.o
[ 94%] Building CUDA object tensorrt_llm/kernels/CMakeFiles/kernels_src.dir/decodingKernels.cu.o
gmake[2]: *** [CMakeFiles/Makefile2:780: tensorrt_llm/common/CMakeFiles/common_src.dir/all] Error 2
gmake[2]: *** Waiting for unfinished jobs....
[ 94%] Building CUDA object tensorrt_llm/kernels/CMakeFiles/kernels_src.dir/gptKernels.cu.o

Jye-525 avatar Feb 14 '24 04:02 Jye-525
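The ptxas errors above are consistent with the CUDA 12.0 toolchain being too old for the FP8 kernels (ptxas rejects the vectorized `atom.max` forms at compile time), which matches the eventual resolution in this thread of moving to CUDA 12.2. As a rough sanity check, one could compare the CUDA version encoded in the TensorRT tarball name against the locally installed toolkit. This is only a sketch: the tarball name is taken from the paths quoted in this thread, and the `sed` parsing is an assumption about NVIDIA's naming convention, not an official tool.

```shell
# Sketch: compare the CUDA version the TensorRT tarball targets
# (encoded in its file name) with the locally installed toolkit.
trt_tarball="TensorRT-9.2.0.5.Linux.x86_64-gnu.cuda-12.2.cudnn8.9"   # from the --trt_root path above
trt_cuda=$(echo "$trt_tarball" | sed -n 's/.*cuda-\([0-9.]*\)\.cudnn.*/\1/p')
local_cuda=$(nvcc --version 2>/dev/null | sed -n 's/.*release \([0-9.]*\),.*/\1/p')
echo "TensorRT built for CUDA $trt_cuda, local toolkit is CUDA ${local_cuda:-not found}"
if [ "$trt_cuda" != "$local_cuda" ]; then
  echo "warning: toolkit mismatch; an older ptxas may reject PTX written for CUDA $trt_cuda" >&2
fi
```

If the two versions differ, aligning the local toolkit with the version in the tarball name (here 12.2) is the first thing to try.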

BTW, does TensorRT-LLM have to use Open MPI? Can I build it with MPICH or Cray MPICH? When I build TensorRT-LLM without Open MPI, it always reports the following error:

[100%] Built target tensorrt_llm_static
[100%] Linking CXX shared library libtensorrt_llm.so
/usr/bin/ld: /home/jieye/viper2/TensorRT-LLM/cpp/tensorrt_llm/batch_manager/x86_64-linux-gnu/libtensorrt_llm_batch_manager_static.pre_cxx11.a(kvCacheManager.cpp.o): in function `tensorrt_llm::batch_manager::kv_cache_manager::KVCacheManager::getMaxNumTokens(tensorrt_llm::batch_manager::kv_cache_manager::KvCacheConfig const&, nvinfer1::DataType, tensorrt_llm::runtime::GptModelConfig const&, tensorrt_llm::runtime::WorldConfig const&, tensorrt_llm::runtime::BufferManager const&)':
kvCacheManager.cpp:(.text+0x213e): undefined reference to `ompi_mpi_comm_world'
collect2: error: ld returned 1 exit status
gmake[3]: *** [tensorrt_llm/CMakeFiles/tensorrt_llm.dir/build.make:1221: tensorrt_llm/libtensorrt_llm.so] Error 1
gmake[2]: *** [CMakeFiles/Makefile2:743: tensorrt_llm/CMakeFiles/tensorrt_llm.dir/all] Error 2
gmake[1]: *** [CMakeFiles/Makefile2:750: tensorrt_llm/CMakeFiles/tensorrt_llm.dir/rule] Error 2
gmake: *** [Makefile:179: tensorrt_llm] Error 2
Traceback (most recent call last):
  File "/home/jieye/viper2/TensorRT-LLM/scripts/build_wheel.py", line 312, in <module>
    main(**vars(args))
  File "/home/jieye/viper2/TensorRT-LLM/scripts/build_wheel.py", line 168, in main
    build_run(
  File "/home/jieye/.conda/envs/dspeed_env/lib/python3.10/subprocess.py", line 526, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command 'cmake --build . --config Release --parallel 64 --target tensorrt_llm tensorrt_llm_static nvinfer_plugin_tensorrt_llm th_common bindings benchmarks' returned non-zero exit status 2.

Jye-525 avatar Feb 14 '24 20:02 Jye-525

I got the same error without the container.

Hukongtao avatar Mar 01 '24 05:03 Hukongtao

> I got the same error without the container.

I solved this problem by using CUDA 12.2 and Open MPI.

Jye-525 avatar Mar 01 '24 16:03 Jye-525

> I got the same error without the container.

> I solved this problem by using CUDA 12.2 and Open MPI.

Glad your issue is solved. Please try using the container, since the dependencies inside it are well tested.

litaotju avatar Mar 22 '24 14:03 litaotju

> openmpi

What do you mean by "openmpi"?

Hukongtao avatar Mar 25 '24 03:03 Hukongtao

> openmpi

> What do you mean by "openmpi"?

I mean that I installed the Open MPI library instead of the MPICH library. TensorRT-LLM's KVCache manager relies on MPI for some of its communication, and the shipped precompiled static library (libtensorrt_llm_batch_manager_static.pre_cxx11.a) is linked against Open MPI.

Jye-525 avatar Mar 25 '24 03:03 Jye-525
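For anyone hitting the same confusion: a quick way to tell which MPI implementation a given `mpicc` wraps is to ask the wrapper to print its underlying compiler command line instead of running it. MPICH's wrapper accepts `-show` and Open MPI's accepts `-showme`, so trying both covers either installation (this assumes `mpicc` is on your PATH):

```shell
# Print the underlying compile/link line of the mpicc wrapper.
# MPICH understands -show; Open MPI understands -showme.
mpicc -show 2>/dev/null || mpicc -showme 2>/dev/null
# Open MPI output typically links -lmpi and uses Open MPI include dirs;
# MPICH (and cray-mpich) output typically links -lmpich.
```

If the output shows `-lmpich`, that toolchain cannot resolve the `ompi_*` symbols the prebuilt TensorRT-LLM static library references.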