model version endpoint is not accessible in Triton prepackaged servers

Open saeid93 opened this issue 3 years ago • 1 comments

Describe the bug

Triton's version policy is not implemented for access through Seldon Core in the Seldon prepackaged servers. The V2 protocol defines the endpoint POST v2/models/${MODEL_NAME}[/versions/${MODEL_VERSION}]/infer; however, the versioned endpoints do not seem to be accessible in the Seldon Triton prepackaged server. For the following Seldon Triton deployment:

apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: image-models
spec:
  name: default
  predictors:
  - graph:
      implementation: TRITON_SERVER
      logger:
        mode: all
      modelUri: s3://minio-seldon/models
      envSecretRefName: seldon-init-container-secret
      name: image-models
      type: MODEL
    name: default
    replicas: 1
  protocol: kfserving

and the following set of uploaded models, each with multiple versions:

I0730 17:03:13.463289 1 server.cc:586] 
+----------+---------+--------+
| Model    | Version | Status |
+----------+---------+--------+
| resnet   | 1       | READY  |
| resnet   | 2       | READY  |
| resnet   | 3       | READY  |
| xception | 1       | READY  |
| xception | 2       | READY  |
| xception | 3       | READY  |
| xception | 4       | READY  |
+----------+---------+--------+

only the latest version of each model, resnet (version 3) and xception (version 4), is accessible. For example, with:

import numpy as np
import requests

URL = "http://localhost:32000/seldon/default/resnet"


def predict(data):
    # Build a V2 inference request from a NumPy array.
    payload = {
        "inputs": [
            {
                "name": "input",
                "data": data.tolist(),
                "datatype": "FP32",
                "shape": list(data.shape),
            }
        ]
    }

    # No version segment in the path.
    r = requests.post(f"{URL}/v2/models/xception/infer", json=payload)
    output = r.json()["outputs"][0]
    predictions = np.array(output["data"]).reshape(output["shape"])
    # Index of the highest score for each item in the batch.
    return [np.argmax(x) for x in predictions]

the value of r.text is:

'{"model_name":"resnet","model_version":"3" ...

and:

'{"model_name":"xception","model_version":"4" ...
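To confirm from a script which version actually answered a request, the model_name and model_version fields can be read from the response JSON. A minimal sketch, assuming the response carries those two fields as in the snippets above; served_model is a hypothetical helper, and the sample dictionary contains only the fields shown in the report:

```python
from typing import Any, Dict, Tuple


def served_model(response_json: Dict[str, Any]) -> Tuple[str, str]:
    """Return (model_name, model_version) from a V2 inference response.

    Hypothetical helper for illustration; it only reads the two fields
    that appear in the responses quoted in this report.
    """
    return str(response_json["model_name"]), str(response_json["model_version"])


# Illustrative response limited to the fields shown in the report.
sample = {"model_name": "xception", "model_version": "4"}
name, version = served_model(sample)
print(name, version)  # xception 4
```

Inside predict, the same call would be served_model(r.json()).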

And explicit versioning as per the V2 protocol:

...
r = requests.post(f"{URL}/v2/models/versions/1/xception/infer", json=data)
...

will result in errors:

...
('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))"
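For reference, the endpoint template quoted at the top of this report places the model name before the optional /versions/ segment. A minimal sketch of a helper that builds the inference path from that template; v2_infer_path is a made-up name for illustration and is not part of Seldon or Triton:

```python
from typing import Optional


def v2_infer_path(model_name: str, model_version: Optional[str] = None) -> str:
    """Build a path following the template
    v2/models/${MODEL_NAME}[/versions/${MODEL_VERSION}]/infer.

    Hypothetical helper for illustration only.
    """
    path = f"v2/models/{model_name}"
    if model_version is not None:
        # In the template, the version segment follows the model name.
        path += f"/versions/{model_version}"
    return f"{path}/infer"


print(v2_infer_path("xception"))       # v2/models/xception/infer
print(v2_infer_path("xception", "1"))  # v2/models/xception/versions/1/infer
```

With this ordering the versioned request would target f"{URL}/{v2_infer_path('xception', '1')}". Whether the Seldon executor forwards that path to Triton is not established in this thread.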

To reproduce

Deploy a model repository containing multiple versions per model and access the endpoints as described above.

Expected behaviour

Model versions should be accessible, for consistency with Triton and the V2 protocol.

Environment

  • Cloud Provider: Bare Metal
  • Kubernetes Cluster Version
Client Version: version.Info{Major:"1", Minor:"24", GitVersion:"v1.24.2", GitCommit:"f66044f4361b9f1f96f0053dd46cb7dce5e990a8", GitTreeState:"clean", BuildDate:"2022-06-15T14:22:29Z", GoVersion:"go1.18.3", Compiler:"gc", Platform:"linux/amd64"}
Kustomize Version: v4.5.4
Server Version: version.Info{Major:"1", Minor:"24+", GitVersion:"v1.24.0-2+59bbb3530b6769", GitCommit:"59bbb3530b6769e4935a05ac0e13c9910c79253e", GitTreeState:"clean", BuildDate:"2022-05-13T06:41:13Z", GoVersion:"go1.18.1", Compiler:"gc", Platform:"linux/amd64"}
  • Deployed Seldon System Images:
value: docker.io/seldonio/seldon-core-executor:1.14.0
image: docker.io/seldonio/seldon-core-operator:1.14.0

saeid93, Jul 30 '22 17:07

Version access has always been problematic in the Kubernetes scenario: it would be preferable to define separate versions via the resources themselves rather than opaquely via models loaded in Triton, as otherwise there are two competing versioning techniques. This would be made simpler by having a clear Model resource in Kubernetes, which we are looking into for the v2 APIs.

ukclivecox, Aug 03 '22 06:08
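The suggestion above, defining versions through Kubernetes resources rather than inside Triton, could be sketched as one SeldonDeployment per model version. The following is only an illustration that reuses the fields from the manifest in the report. The function name versioned_deployments, the resource naming scheme, and the per-version storage layout under the bucket are all assumptions, not something the thread or Seldon prescribes:

```python
import json
from typing import Any, Dict, List


def versioned_deployments(
    model: str, versions: List[str], base_uri: str
) -> List[Dict[str, Any]]:
    """Build one SeldonDeployment manifest (as a dict) per model version.

    Assumption: each version is stored in its own model repository at
    f"{base_uri}/{model}-v{version}". This layout is hypothetical.
    """
    manifests = []
    for version in versions:
        name = f"{model}-v{version}"
        manifests.append({
            "apiVersion": "machinelearning.seldon.io/v1",
            "kind": "SeldonDeployment",
            "metadata": {"name": name},
            "spec": {
                "name": "default",
                "protocol": "kfserving",
                "predictors": [{
                    "name": "default",
                    "replicas": 1,
                    "graph": {
                        "implementation": "TRITON_SERVER",
                        "modelUri": f"{base_uri}/{name}",
                        "envSecretRefName": "seldon-init-container-secret",
                        "name": model,
                        "type": "MODEL",
                    },
                }],
            },
        })
    return manifests


manifests = versioned_deployments("xception", ["1", "2"], "s3://minio-seldon")
print(json.dumps(manifests[0], indent=2))
```

Each version then gets its own resource name and its own endpoint, so the version is selected by the Kubernetes resource rather than by a path segment handled inside Triton.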

Please test in v2. We don't explicitly handle versions in the same way, as this is transparent to the k8s resource updates.

ukclivecox, Dec 05 '22 11:12