
Relative path for s3 storage

Revist opened this issue 3 years ago

Description

I have the following model repository structure:

model_repository
├── model1
│   ├── 1
│   │   ├── model.py
│   │   └── ...
│   └── config.pbtxt
├── model2
│   ├── 1
│   │   ├── model.py
│   │   └── ...
│   └── config.pbtxt
└── my_environment.tar.gz

To use my_environment.tar.gz, every config.pbtxt contains:

parameters: {
  key: "EXECUTION_ENV_PATH",
  value: {string_value: "$$TRITON_MODEL_DIRECTORY/../my_environment.tar.gz"}
}

This works when the model_repository is local; however, on S3 I get the following error:

failed to load 'model' version 1: Internal: Failed to get the canonical path for /tmp/folderWE2Op7/../my_environment.tar.gz.

Triton Information

What version of Triton are you using?

I am using the 22.03 Docker image.

Is there an easy way to use relative paths on S3?

Revist avatar Jun 16 '22 12:06 Revist

I think this should work fine with either an S3 or a local model repository. I have filed a ticket to look into this (DLIS-3870).

Tabrizian avatar Jun 16 '22 20:06 Tabrizian

I think this is tricky for cloud storage: internally, Triton downloads the model to the local file system (which is why you saw /tmp/folderWE2Op7), and that breaks relative paths, because files outside the model directory are not downloaded.

GuanLuo avatar Jun 16 '22 22:06 GuanLuo
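To illustrate the point above, here is a minimal sketch (the temporary folder name is just the one from the error message) of how the parent-relative path ends up pointing at a file that was never downloaded:

import os

# Triton fetches the model from S3 into a temporary local directory, e.g.:
local_model_dir = "/tmp/folderWE2Op7"

# $$TRITON_MODEL_DIRECTORY expands to that local directory, so the configured
# path becomes:
env_path = os.path.join(local_model_dir, "..", "my_environment.tar.gz")

# Resolving it points outside the downloaded model directory, at a file that
# was never copied from S3 -- hence the "Failed to get the canonical path" error.
print(os.path.realpath(env_path))  # /tmp/my_environment.tar.gz (does not exist)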

I see. Then the question is: how can one tar file be shared across multiple models? These tars tend to be big.

Revist avatar Jun 17 '22 09:06 Revist
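One workaround under this behavior (an editorial sketch, not advice from the thread) is to keep a copy of the archive inside each model directory, so that everything EXECUTION_ENV_PATH references is downloaded together with the model:

model_repository
├── model1
│   ├── 1
│   │   └── model.py
│   ├── config.pbtxt
│   └── my_environment.tar.gz
└── model2
    ├── 1
    │   └── model.py
    ├── config.pbtxt
    └── my_environment.tar.gz

with each config.pbtxt pointing inside its own model directory:

parameters: {
  key: "EXECUTION_ENV_PATH",
  value: {string_value: "$$TRITON_MODEL_DIRECTORY/my_environment.tar.gz"}
}

The cost is exactly the duplication raised in the next comment.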

Is there any update on this issue? Being able to share conda environments in an S3 model_repository would be a huge benefit. Duplicating the tars in every model that uses them is not a scalable solution.

LMarino1 avatar Aug 17 '22 16:08 LMarino1

We have tweaked the code a bit, so the above scenario will now work. Please refer to the Python backend documentation for the up-to-date limitations of "EXECUTION_ENV_PATH", as we are progressively improving its handling.

kthui avatar Oct 15 '22 00:10 kthui

Any update? I am facing the same issue. How can I share a Python package on S3 storage between different Python models? @kthui

zhaozhiming37 avatar Oct 27 '22 09:10 zhaozhiming37

@zhaozhiming37 this should be fixed in the 22.11 release of the Triton container.

jbkyang-nvi avatar Nov 07 '22 23:11 jbkyang-nvi

@kthui @jbkyang-nvi @Tabrizian This is still an issue with r22.11 on GCS. Please help.

achbogga avatar Dec 16 '22 03:12 achbogga

I also cannot load a PyTorch weights file via a relative path because of the same path issue, as shown below:

1216 05:34:41.189214 1 pb_stub.cc:313] Failed to initialize Python stub: FileNotFoundError: [Errno 2] No such file or directory: '/models/iunuml3_loftr/loftr-384-pgf-v0.pt'
At:
  /tmp/python_env_X9Kpwj/0/lib/python3.8/site-packages/torch/serialization.py(251): __init__
  /tmp/python_env_X9Kpwj/0/lib/python3.8/site-packages/torch/serialization.py(270): _open_file_like
  /tmp/python_env_X9Kpwj/0/lib/python3.8/site-packages/torch/serialization.py(771): load
  /tmp/python_env_X9Kpwj/0/lib/python3.8/site-packages/iunuml3-0.0.1-py3.8-linux-x86_64.egg/iunuml3/utils/utils.py(35): load_model
  /tmp/folderPvjM0i/1/model.py(63): __init__
  /tmp/folderPvjM0i/1/model.py(128): initialize
I1216 05:34:42.444039 1 python_be.cc:1856] TRITONBACKEND_ModelInstanceInitialize: iunuml3_line_0 (GPU device 0)
E1216 05:34:42.558818 1 model_lifecycle.cc:597] failed to load 'iunuml3_loftr' version 1: Internal: FileNotFoundError: [Errno 2] No such file or directory: '/models/iunuml3_loftr/loftr-384-pgf-v0.pt'

achbogga avatar Dec 16 '22 18:12 achbogga
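One way to sidestep the hard-coded absolute path (/models/iunuml3_loftr/loftr-384-pgf-v0.pt) in the traceback above is to resolve the weights file relative to model.py itself. This is a sketch, not code from the thread, and it assumes the .pt file is stored inside the model's version directory next to model.py, since files outside the model directory are not downloaded from cloud storage:

import os

import torch


class TritonPythonModel:
    def initialize(self, args):
        # model.py runs from wherever Triton placed the model locally
        # (e.g. /tmp/folderPvjM0i/1), so resolve the weights relative to it
        # rather than hard-coding a repository path such as /models/...
        version_dir = os.path.dirname(os.path.abspath(__file__))
        weights_path = os.path.join(version_dir, "loftr-384-pgf-v0.pt")
        self.model = torch.load(weights_path, map_location="cpu")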

This still seems to be a problem; relative paths do not work.

simonrmonk avatar Sep 21 '23 17:09 simonrmonk

^ @kthui any idea why this might be happening?

jbkyang-nvi avatar Sep 22 '23 23:09 jbkyang-nvi

Please see points 4 and 5 in the documentation regarding relative path limitations: https://github.com/triton-inference-server/python_backend#important-notes

If you need improvements on relative path handling, please open a feature request: https://github.com/triton-inference-server/server/issues/new/choose

kthui avatar Sep 23 '23 00:09 kthui