
Incompatibility of metadata access in multi-model Triton nodes

Open · saeid93 opened this issue · 1 comment

Describe the bug

In prepackaged multi-model Triton servers, only one of the deployed models' V2-protocol metadata endpoints is accessible under v2/models/${MODEL_NAME}. This is because the name field under predictors accepts only a single model name. For example, I deployed two (similar) models in the Seldon prepackaged Triton server; both models are loaded and available for inference under the /infer endpoint. For metadata, however, it is possible to supply only one of their names (onnx-gpt2-model1 in the example below), so only that model's metadata is exposed.

apiVersion: machinelearning.seldon.io/v1alpha2
kind: SeldonDeployment
metadata:
  name: gpt2-multi
spec:
  predictors:
  - graph:
      implementation: TRITON_SERVER
      logger:
        mode: all
      modelUri: s3://language-models-multi
      envSecretRefName: seldon-init-container-secret
      name: onnx-gpt2-model1
      type: MODEL
    name: default
    replicas: 1
  protocol: kfserving

The Triton server log shows the two models deployed successfully:

I0730 15:48:08.086312 1 server.cc:586] 
+------------------+---------+--------+
| Model            | Version | Status |
+------------------+---------+--------+
| onnx-gpt2-model1 | 1       | READY  |
| onnx-gpt2-model2 | 1       | READY  |
+------------------+---------+--------+

The onnx-gpt2-model1 metadata is accessible:

curl -s http://localhost:32000/seldon/default/gpt2-multi/v2/models/onnx-gpt2-model1
{"name":"onnx-gpt2-model1","versions":["1"],"platform":"onnxruntime_onnx","inputs":[{"name":"input_ids","datatype":"INT32","shape":[-1,-1]},{"name":"attention_mask","datatype":"INT32","shape":[-1,-1]}],"outputs":[{"name":"past_key_values","datatype":"FP32","shape":[12,2,-1,12,-1,64]},{"name":"logits","datatype":"FP32","shape":[-1,-1,50257]}]}

But the onnx-gpt2-model2 metadata is not:

curl -s http://localhost:32000/seldon/default/gpt2-multi/v2/models/onnx-gpt2-model2
{"status":{"code":500,"info":"Failed to find model onnx-gpt2-model2","status":"FAILURE"}}
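The URL pattern behind these two requests can be captured in a small helper. This is a hypothetical sketch (build_metadata_url is not part of any Seldon client library); the host, namespace, and deployment values mirror the example above:

```python
# The Seldon-proxied V2 metadata path used in this issue follows:
#   /seldon/<namespace>/<deployment>/v2/models/<model>
# build_metadata_url is a hypothetical helper, not a Seldon API.

def build_metadata_url(host: str, namespace: str, deployment: str, model: str) -> str:
    """Build the Seldon-proxied V2 model-metadata URL for one model."""
    return f"http://{host}/seldon/{namespace}/{deployment}/v2/models/{model}"

# Per the bug, only the model named in the graph (onnx-gpt2-model1)
# resolves; the URL for onnx-gpt2-model2 returns the 500 shown above.
for model in ["onnx-gpt2-model1", "onnx-gpt2-model2"]:
    print(build_metadata_url("localhost:32000", "default", "gpt2-multi", model))
```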

I think the name field should accept a list of names (one per deployed model) instead of a single name; alternatively, the names could be discovered automatically from the Triton server.
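A purely hypothetical sketch of what the first option might look like (a "names" field does not exist in the SeldonDeployment CRD; it is invented here only to illustrate the proposal):

```yaml
# Hypothetical sketch only: "names" is not valid Seldon syntax today.
- graph:
    implementation: TRITON_SERVER
    modelUri: s3://language-models-multi
    names:
    - onnx-gpt2-model1
    - onnx-gpt2-model2
    type: MODEL
```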

To reproduce

Follow the Pretrained GPT2 Model Deployment Example, but generate two similar models instead of one.

Expected behaviour

All models' metadata endpoints should be available.

Environment

  • Cloud Provider: Bare Metal
  • Kubernetes Cluster Version
Client Version: version.Info{Major:"1", Minor:"24", GitVersion:"v1.24.2", GitCommit:"f66044f4361b9f1f96f0053dd46cb7dce5e990a8", GitTreeState:"clean", BuildDate:"2022-06-15T14:22:29Z", GoVersion:"go1.18.3", Compiler:"gc", Platform:"linux/amd64"}
Kustomize Version: v4.5.4
Server Version: version.Info{Major:"1", Minor:"24+", GitVersion:"v1.24.0-2+59bbb3530b6769", GitCommit:"59bbb3530b6769e4935a05ac0e13c9910c79253e", GitTreeState:"clean", BuildDate:"2022-05-13T06:41:13Z", GoVersion:"go1.18.1", Compiler:"gc", Platform:"linux/amd64"}
  • Deployed Seldon System Images:
value: docker.io/seldonio/seldon-core-executor:1.14.0
image: docker.io/seldonio/seldon-core-operator:1.14.0

saeid93 · Jul 30 '22 16:07

I think this is related to #4240 and would be solved by clearer Model semantics which we are investigating.

ukclivecox · Aug 03 '22 06:08

Please test in Seldon Core V2.

ukclivecox · Dec 05 '22 11:12