
The TRT-LLM container does not have the other backends

MatthieuToulemont opened this issue 1 year ago • 7 comments

I want to run a version of Triton that has both the TensorRT-LLM backend and the other backends (TensorRT, ONNX, TorchScript). Is that possible?

Container version: 24.05

MatthieuToulemont avatar Jun 12 '24 09:06 MatthieuToulemont

Hi @MatthieuToulemont, the Triton TRT-LLM container is a special container that only contains the TRT-LLM backend and the Python backend. If you'd like to have other backends, you could try either of the following:

  • copy over the backends from the regular Triton container's /opt/tritonserver/backends/* (a sketch follows below), or
  • build the container yourself following the steps here.
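
A minimal sketch of the first option, assuming the 24.05 tags: a multi-stage Dockerfile that copies the extra backend directories from the regular image into the TRT-LLM image. The backend directory names below are the usual ones but are illustrative; adjust the list to what you actually need.

```dockerfile
# Sketch: start from the TRT-LLM image and copy selected backends
# over from the regular Triton image of the same release (24.05 assumed).
FROM nvcr.io/nvidia/tritonserver:24.05-py3 AS full

FROM nvcr.io/nvidia/tritonserver:24.05-trtllm-python-py3

# Copy the additional backend directories out of /opt/tritonserver/backends/.
COPY --from=full /opt/tritonserver/backends/onnxruntime /opt/tritonserver/backends/onnxruntime
COPY --from=full /opt/tritonserver/backends/tensorrt    /opt/tritonserver/backends/tensorrt
COPY --from=full /opt/tritonserver/backends/pytorch     /opt/tritonserver/backends/pytorch
```

The ONNX Runtime and PyTorch backends generally bundle their runtime libraries inside their own backend directories, so copying the directory is usually enough; if a backend fails to load at startup, check for missing shared libraries with ldd.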

krishung5 avatar Jun 12 '24 19:06 krishung5

OK, is the Python backend in the TRT-LLM container different from the one in 24.05-py3?

MatthieuToulemont avatar Jun 13 '24 08:06 MatthieuToulemont

No, the Python Backend should be the same.

krishung5 avatar Jun 13 '24 18:06 krishung5

Does 24.05-py3 contain the ONNX, TensorRT, and TorchScript backends?

tricky61 avatar Jun 18 '24 09:06 tricky61

@tricky61 The nvcr.io/nvidia/tritonserver:24.05-py3 container contains the ONNX, TensorRT, and PyTorch backends. The nvcr.io/nvidia/tritonserver:24.05-trtllm-python-py3 container only has the TRT-LLM and Python backends.

krishung5 avatar Jun 18 '24 17:06 krishung5

OK. I am currently using nvcr.io/nvidia/tritonserver:24.05-vllm-python-py3; I have also used tritonserver:23.11 with the vLLM backend added manually. I will try adding the vLLM backend to nvcr.io/nvidia/tritonserver:24.05-py3 manually. Does this method make any difference? The nvcr.io/nvidia/tritonserver:24.05-vllm-python-py3 image is 11.2 GB, while nvcr.io/nvidia/tritonserver:24.05-py3 is 7.55 GB.

tricky61 avatar Jun 19 '24 01:06 tricky61

@tricky61 It shouldn't make any difference. Note that you'd have to pip install vllm and make sure model.py exists under /opt/tritonserver/backends/vllm_backend.
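
A minimal sketch of that manual approach, assuming the 24.05 tags. The vLLM version to install and the backend directory name (the comment above mentions vllm_backend; some releases use vllm) should be verified against the image you start from.

```dockerfile
# Sketch: add the vLLM backend to the regular 24.05 image by hand.
FROM nvcr.io/nvidia/tritonserver:24.05-vllm-python-py3 AS vllm

FROM nvcr.io/nvidia/tritonserver:24.05-py3

# vLLM must be installed in the image's Python environment; pinning the
# version shipped in the matching vLLM image is the safest choice.
RUN pip install vllm

# The vLLM backend is Python-based, so it mainly needs its model.py in place.
# Verify the directory name with `ls /opt/tritonserver/backends` in the vLLM image.
COPY --from=vllm /opt/tritonserver/backends/vllm /opt/tritonserver/backends/vllm
```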

krishung5 avatar Jun 26 '24 19:06 krishung5

Closing due to inactivity.

Tabrizian avatar Sep 06 '24 14:09 Tabrizian