seldon-core
model version endpoint is not accessible in Triton prepackaged servers
Describe the bug
The Triton version policy is not implemented for access through Seldon Core in the Seldon prepackaged servers. The V2 protocol defines the inference endpoint as POST v2/models/${MODEL_NAME}[/versions/${MODEL_VERSION}]/infer; however, the versioned form of this endpoint does not seem to be accessible in the Seldon Triton prepackaged servers.
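As a minimal sketch of the path template quoted above, the helper below builds the inference path with an optional version segment. The function name `infer_path` is hypothetical and not part of Seldon Core or Triton; it only mirrors the template string.

```python
from typing import Optional


def infer_path(model_name: str, model_version: Optional[str] = None) -> str:
    """Build a V2 inference path; the version segment is optional."""
    path = f"v2/models/{model_name}"
    if model_version is not None:
        # In the template, the version segment follows the model name.
        path += f"/versions/{model_version}"
    return f"{path}/infer"


print(infer_path("xception"))
print(infer_path("xception", "1"))
```

In this template the `versions/<n>` segment sits between the model name and `infer`, so a versioned request for xception would target `v2/models/xception/versions/1/infer`.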
For this Seldon Triton server deployment:
apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: image-models
spec:
  name: default
  predictors:
  - graph:
      implementation: TRITON_SERVER
      logger:
        mode: all
      modelUri: s3://minio-seldon/models
      envSecretRefName: seldon-init-container-secret
      name: image-models
      type: MODEL
    name: default
    replicas: 1
  protocol: kfserving
and this set of uploaded models, each with multiple versions:
I0730 17:03:13.463289 1 server.cc:586]
+----------+---------+--------+
| Model | Version | Status |
+----------+---------+--------+
| resnet | 1 | READY |
| resnet | 2 | READY |
| resnet | 3 | READY |
| xception | 1 | READY |
| xception | 2 | READY |
| xception | 3 | READY |
| xception | 4 | READY |
+----------+---------+--------+
only the latest version of each model, resnet (version 3) and xception (version 4), is accessible.
For example, with:
import json
import requests
import numpy as np

URL = "http://localhost:32000/seldon/default/resnet"


def predict(data):
    data = {
        "inputs": [
            {
                "name": "input",
                "data": data.tolist(),
                "datatype": "FP32",
                "shape": data.shape,
            }
        ]
    }
    r = requests.post(f"{URL}/v2/models/xception/infer", json=data)
    predictions = np.array(r.json()["outputs"][0]["data"]).reshape(
        r.json()["outputs"][0]["shape"]
    )
    output = [np.argmax(x) for x in predictions]
    return output
the value of r.text is:
'{"model_name":"resnet","model_version":"3" ...
and:
'{"model_name":"xception","model_version":"4" ...
And explicit versioning, as per the V2 protocol:
...
r = requests.post(f"{URL}/v2/models/versions/1/xception/infer", json=data)
...
will result in errors:
...
('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))"
To reproduce
Use a multi-version deployment and access the endpoints as described in the previous section.
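One possible way to script the reproduction is to probe each model version in turn. The sketch below uses the V2 per-version readiness endpoint (`GET v2/models/${MODEL_NAME}/versions/${MODEL_VERSION}/ready`) and only the standard library. The base URL and version lists are copied from this report. Whether this endpoint is exposed through the Seldon executor is an assumption, not something confirmed here, and the names `ready_url` and `probe` are hypothetical.

```python
import urllib.error
import urllib.request

# Assumption: same ingress URL as in the report; adjust for your cluster.
BASE_URL = "http://localhost:32000/seldon/default/resnet"
# Versions taken from the Triton startup table in the report.
MODELS = {"resnet": [1, 2, 3], "xception": [1, 2, 3, 4]}


def ready_url(base_url, model, version):
    """V2 readiness endpoint for a single model version."""
    return f"{base_url}/v2/models/{model}/versions/{version}/ready"


def probe(base_url=BASE_URL, models=MODELS, opener=urllib.request.urlopen):
    """Return {(model, version): HTTP status code or error string}."""
    results = {}
    for model, versions in models.items():
        for version in versions:
            url = ready_url(base_url, model, version)
            try:
                with opener(url, timeout=5) as resp:
                    results[(model, version)] = resp.status
            except urllib.error.HTTPError as exc:
                # The server answered, but with a non-2xx status.
                results[(model, version)] = exc.code
            except OSError as exc:
                # Covers URLError and connection resets like the one reported.
                results[(model, version)] = f"error: {exc}"
    return results
```

Calling `probe()` against a running deployment returns one entry per model version, which makes it easy to see which versions respond and which fail with a connection error.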
Expected behaviour
Model versions should be accessible for consistency with Triton and the V2 protocol.
Environment
- Cloud Provider: Bare Metal
- Kubernetes Cluster Version
Client Version: version.Info{Major:"1", Minor:"24", GitVersion:"v1.24.2", GitCommit:"f66044f4361b9f1f96f0053dd46cb7dce5e990a8", GitTreeState:"clean", BuildDate:"2022-06-15T14:22:29Z", GoVersion:"go1.18.3", Compiler:"gc", Platform:"linux/amd64"}
Kustomize Version: v4.5.4
Server Version: version.Info{Major:"1", Minor:"24+", GitVersion:"v1.24.0-2+59bbb3530b6769", GitCommit:"59bbb3530b6769e4935a05ac0e13c9910c79253e", GitTreeState:"clean", BuildDate:"2022-05-13T06:41:13Z", GoVersion:"go1.18.1", Compiler:"gc", Platform:"linux/amd64"}
- Deployed Seldon System Images:
value: docker.io/seldonio/seldon-core-executor:1.14.0
image: docker.io/seldonio/seldon-core-operator:1.14.0
Version access has always been problematic in the Kubernetes scenario: it would be preferable to define separate versions via the Resources themselves rather than opaquely via models loaded in Triton, as otherwise there are two competing versioning techniques. This would be made simpler by having a clear Model resource in Kubernetes, which we are looking into for the v2 APIs.
Please test in v2. We don't explicitly handle versions in the same way, as this is transparent to the k8s resource updates.