Triton TensorRT-LLM 24.04 and 24.05 are very large

Open yaysummeriscoming opened this issue 1 year ago • 3 comments

Description

Looking at the catalog, the Triton containers for 24.04 and 24.05 are very large, both coming in at ~18.5 GB. Versions 24.01 through 24.03 were ~8.5 GB. Why is this? Does it have something to do with the containers being built on previous versions of the Triton base container?

yaysummeriscoming avatar Jun 08 '24 12:06 yaysummeriscoming

We are aware of the current concerns with container size. The team is working on this with priority.

statiraju avatar Jun 11 '24 17:06 statiraju

@nvda-mesharma for viz

statiraju avatar Jun 11 '24 17:06 statiraju

> We are aware of the current concerns with container size. The team is working on this with priority

Great to hear, thanks!

yaysummeriscoming avatar Jun 12 '24 13:06 yaysummeriscoming

Hi @yaysummeriscoming, the TRT-LLM container has a different dependency stack. Some packages, such as PyTorch, are required at runtime, so the image is larger than the other Triton images. We are reducing the container size with each release. Closing this issue, as we have an internal tracker for the TRT-LLM image size. Thanks for bringing this up!

krishung5 avatar Aug 26 '24 22:08 krishung5

@krishung5 OK, thanks. Removing the dependency on PyTorch would be great too.

yaysummeriscoming avatar Aug 27 '24 15:08 yaysummeriscoming