Allow for explicit folder name when specifying where remote model repository will be downloaded.

Open mcthill opened this issue 1 year ago • 12 comments

Is your feature request related to a problem? Please describe. When a model repository is downloaded from a remote location, other files may need to reference the downloaded files by an explicit path. Currently, the remote repository download uses TRITON_AWS_MOUNT_DIRECTORY (AWS implementation) and downloads the files to the location you specify, but it then places them in a randomly named subfolder with a name like folderXXXXXX. Placing the files in a randomly named folder defeats the purpose of specifying an explicit directory, because the final path is not known in advance and so cannot be referenced in your files.

Describe the solution you'd like Either remove the folderXXXXXX subfolder and place the files directly in TRITON_AWS_MOUNT_DIRECTORY, or provide another environment variable that sets an explicit folder name and overrides the folderXXXXXX naming convention.
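
To make the request concrete, here is a minimal Python sketch of the second option. It is not Triton's implementation; it only mimics the behavior described above. `TRITON_AWS_MOUNT_DIRECTORY` is the variable named in this issue, while `EXPLICIT_FOLDER_NAME` and `resolve_download_dir` are hypothetical names invented for illustration.

```python
import os
import tempfile
from pathlib import Path


def resolve_download_dir(env=None):
    """Pick the local directory a remote model repository is downloaded into."""
    env = os.environ if env is None else env

    # TRITON_AWS_MOUNT_DIRECTORY is the variable named in this issue.
    # Falling back to the system temp dir is an assumption of this sketch.
    mount_dir = Path(env.get("TRITON_AWS_MOUNT_DIRECTORY", tempfile.gettempdir()))
    mount_dir.mkdir(parents=True, exist_ok=True)

    # HYPOTHETICAL override variable; it does not exist in Triton.
    explicit_name = env.get("EXPLICIT_FOLDER_NAME")
    if explicit_name:
        target = mount_dir / explicit_name
        target.mkdir(exist_ok=True)
        return target  # stable path that config files can reference

    # Mimics the reported behavior: a randomly named "folder..." subdirectory.
    return Path(tempfile.mkdtemp(prefix="folder", dir=str(mount_dir)))
```

With the override set, the returned path is predictable (`<mount dir>/<explicit name>`), so other files can point at it. Without it, the suffix changes on every call.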

Describe alternatives you've considered

  1. Using the Triton server image as a base container and adding an initialization step that downloads the remote repository to a specific location.
  2. Using an init container to download the repository to an explicit location.
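
Both alternatives amount to copying the repository to a known path before the server starts. A rough sketch of that initialization step follows, assuming the AWS CLI and `tritonserver` are on the `PATH`; the function names, bucket URI, and local path are hypothetical placeholders.

```python
import subprocess


def build_commands(s3_uri, local_dir):
    """Return the download command and the server start command."""
    # `aws s3 sync` copies the remote prefix into local_dir as-is,
    # so no randomly named subfolder is introduced.
    sync_cmd = ["aws", "s3", "sync", s3_uri, local_dir]
    # Point Triton at the local copy instead of the S3 URI.
    serve_cmd = ["tritonserver", f"--model-repository={local_dir}"]
    return sync_cmd, serve_cmd


def main(s3_uri, local_dir):
    sync_cmd, serve_cmd = build_commands(s3_uri, local_dir)
    subprocess.run(sync_cmd, check=True)   # fail fast if the download fails
    subprocess.run(serve_cmd, check=True)  # blocks while the server runs
```

For example, `main("s3://example-bucket/model_repository", "/models")` would download to `/models` and then serve from it. In an init container setup, only the sync step would run in the init container, writing to a volume shared with the Triton container.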

Additional context N/A

mcthill avatar Nov 29 '23 13:11 mcthill

Thanks for suggesting improvements. I have added a ticket for us to investigate further.

kthui avatar Nov 30 '23 00:11 kthui

Hi, is there any update on this feature? It would be quite useful for loading large LLMs from S3.

shixianc avatar Jan 16 '24 03:01 shixianc

The tensorrt-llm backend requires setting gpt_model_path. This path can't be relative, so it fails with S3-based model repos. Any update on this @kthui?

danielchalef avatar Jan 21 '24 01:01 danielchalef

Apologies for the delay and thank you for following up. I flagged the ticket so that we can look at prioritizing it.

dyastremsky avatar Feb 20 '24 19:02 dyastremsky

@dyastremsky another interested customer here, any updates about prioritizing this feature?

jadhosn avatar Mar 12 '24 19:03 jadhosn

Thanks for checking in. Checked with folks about prioritization.

dyastremsky avatar Mar 12 '24 19:03 dyastremsky

Hi, do you have an update on this feature to share with us?

Alexis-Jacob avatar May 22 '24 16:05 Alexis-Jacob

There's some work being done that this feature depends on. Work on it is resuming.

Once that is merged, this feature can be worked on.

dyastremsky avatar May 22 '24 17:05 dyastremsky