server
Triton TensorRT-LLM containers 24.04 and 24.05 are very large
Description
Looking at the catalog, the Triton TensorRT-LLM containers for 24.04 and 24.05 are very large, both coming in at ~18.5 GB. The 24.01 through 24.03 containers were ~8.5 GB. Why is this? Is it something to do with the containers being built on previous versions of the Triton base container?
We are aware of the current concerns with container size. The team is working on this as a priority.
@nvda-mesharma for viz
Great to hear, thanks!
Hi @yaysummeriscoming, the TRT-LLM container has a different dependency stack. Some packages, like PyTorch, are required at runtime, so the image is larger than the other Triton images. We are working to reduce the container size over upcoming releases. Closing this issue, as we have an internal tracker for the TRT-LLM image size. Thanks for bringing this up!
@krishung5 OK, thanks. Removing the dependency on PyTorch would be great too.