
Question about container image

Open · geraldstanje opened this issue 1 year ago · 1 comment

Hi,

I'm currently using the following container to serve a transformer model: nvcr.io/nvidia/tritonserver:23.12-py3
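For reference, a minimal client-side sanity check against a server started from that image might look like the sketch below (localhost:8000 is Triton's default HTTP port; the model name `my_transformer` is just a placeholder, and `tritonclient[http]` is assumed to be installed via pip):

```python
# Sketch: check that a Triton server (started from the tritonserver image) and a
# deployed model are ready before sending inference requests.
# Assumptions: server reachable at localhost:8000, model name "my_transformer".
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

# Confirm the server and the specific model are up.
print("server ready:", client.is_server_ready())
print("model ready: ", client.is_model_ready("my_transformer"))

# Inspect the model's input/output names and shapes.
print(client.get_model_metadata("my_transformer"))
```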

When I check the release notes for nvcr.io/nvidia/tritonserver:23.12-py3, I see TensorRT-LLM version release/0.7.0. Here is the link: https://docs.nvidia.com/deeplearning/triton-inference-server/release-notes/rel-23-12.html

I also want to serve a large language model, e.g. Llama 3 Guard (https://huggingface.co/meta-llama/Meta-Llama-Guard-2-8B). Can I use the container nvcr.io/nvidia/tritonserver:23.12-py3 to serve a Llama 3 model?

How is the following container different from the one above? nvcr.io/nvidia/tritonserver:23.12-trtllm-python-py3
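If it helps, would checking for the TensorRT-LLM Python package inside each image be a reasonable way to tell them apart? A rough sketch (the module name `tensorrt_llm` is an assumption on my part):

```python
# Sketch: run inside each container's Python interpreter to see whether the
# TensorRT-LLM Python package is importable there.
try:
    import tensorrt_llm
    # __version__ may not be defined in every build, so fall back gracefully.
    print("tensorrt_llm found, version:", getattr(tensorrt_llm, "__version__", "unknown"))
except ImportError:
    print("tensorrt_llm is not installed in this image")
```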

Thanks, Gerald

geraldstanje · Jun 26 '24 14:06

Hi @geraldstanje, could you post the question in https://github.com/triton-inference-server/server? We have Triton experts there. Thanks!

ttyio · Aug 07 '24 05:08

@geraldstanje, I will be closing this ticket per our policy of closing tickets with no activity for more than 21 days after a reply has been posted. Please open a new ticket if you still need help.

moraxu · Sep 07 '24 01:09