The trt llm container does not have the other backends
I want to run a version of Triton that has both the TensorRT-LLM backend and the other backends (TensorRT, ONNX, TorchScript). Is that possible?
Container version: 24.05
Hi @MatthieuToulemont, the Triton TRT-LLM container is a special container that only ships the TRT-LLM backend and the Python backend. If you'd like the other backends as well, you could try either of the following:
- copy the backends over from /opt/tritonserver/backends/* in the regular Triton container, or
- build the container yourself following the steps here.
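The copy-over option could be sketched as a multi-stage Dockerfile. This is a minimal sketch, not a tested recipe: the backend directory names (onnxruntime, tensorrt, pytorch) are assumptions based on the usual container layout, and some backends may also need shared libraries outside their backend directory.

```dockerfile
# Start from the TRT-LLM container so the TRT-LLM and Python backends
# are already in place.
FROM nvcr.io/nvidia/tritonserver:24.05-trtllm-python-py3

# Copy the additional backends out of the regular container.
# The directory names below are assumptions; verify them first with:
#   docker run --rm nvcr.io/nvidia/tritonserver:24.05-py3 ls /opt/tritonserver/backends
COPY --from=nvcr.io/nvidia/tritonserver:24.05-py3 \
    /opt/tritonserver/backends/onnxruntime /opt/tritonserver/backends/onnxruntime
COPY --from=nvcr.io/nvidia/tritonserver:24.05-py3 \
    /opt/tritonserver/backends/tensorrt /opt/tritonserver/backends/tensorrt
COPY --from=nvcr.io/nvidia/tritonserver:24.05-py3 \
    /opt/tritonserver/backends/pytorch /opt/tritonserver/backends/pytorch
```

After building, starting the container and checking the backends Triton reports at startup is a quick way to confirm each copied backend actually loads.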
OK, is the Python backend from the TRT-LLM container different from the one in 24.05-py3?
No, the Python Backend should be the same.
Does 24.05-py3 contain the ONNX, TensorRT, and TorchScript backends?
@tricky61 The nvcr.io/nvidia/tritonserver:24.05-py3 container contains ONNX, TRT and PyTorch backends. The nvcr.io/nvidia/tritonserver:24.05-trtllm-python-py3 only has TRTLLM and Python backends.
OK. I am using nvcr.io/nvidia/tritonserver:24.05-vllm-python-py3. I also use tritonserver:23.11 with the vLLM backend added manually. I will try adding the vLLM backend to nvcr.io/nvidia/tritonserver:24.05-py3 manually. Does this method make a difference? I ask because nvcr.io/nvidia/tritonserver:24.05-vllm-python-py3 is 11.2 GB while nvcr.io/nvidia/tritonserver:24.05-py3 is 7.55 GB.
@tricky61 It shouldn't make any difference. Note that you'd have to pip install vllm and make sure model.py exists under /opt/tritonserver/backends/vllm_backend.
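The manual route described above could be sketched as a small Dockerfile. This is only a sketch under stated assumptions: the idea of copying model.py from a local checkout of the vllm_backend repo is mine, and the source path should be adjusted to wherever you keep that file.

```dockerfile
# Build on the regular container, which already has the other backends.
FROM nvcr.io/nvidia/tritonserver:24.05-py3

# Install vLLM into the container's Python environment, as noted above.
RUN pip install vllm

# Place the backend's model.py where Triton expects it.
# The local source path "model.py" is an assumption; point it at your
# copy of the file (e.g. from a checkout of the vllm_backend repo).
RUN mkdir -p /opt/tritonserver/backends/vllm_backend
COPY model.py /opt/tritonserver/backends/vllm_backend/model.py
```

Pinning the vllm version in the pip install line is worth considering, so the image matches whatever version the prebuilt vllm container ships.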
Closing due to inactivity.