Open
sgaist
opened this issue 1 year ago
· 5 comments
Description
This PR ensures that a default value for the KServe protocol is passed forward to the ModelServer class and that the class itself uses a default value if instantiated directly.
With the current implementation, if the PROTOCOL_VERSION environment variable is not set, the protocol property of the ModelServer instance will be left unset as well, which triggers an exception when data is returned after processing, because the base class validates the protocol's value.
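As an illustration, here is a minimal sketch of the intended fallback logic; the PredictorProtocol stand-in mirrors the enum named in the error logs below, and the "v1" default is an assumed placeholder for whatever default the wrapper actually uses:

```python
import os
from enum import Enum


# Stand-in for kserve's PredictorProtocol enum; the member values follow the
# "is not a valid PredictorProtocol" error shown in the logs further down.
class PredictorProtocol(Enum):
    REST_V1 = "v1"
    REST_V2 = "v2"
    GRPC_V2 = "grpc-v2"


# Assumed default; the actual value is whatever the wrapper/ModelServer defines.
DEFAULT_PROTOCOL = PredictorProtocol.REST_V1.value

# Fall back to the default when PROTOCOL_VERSION is unset instead of passing None.
protocol = os.environ.get("PROTOCOL_VERSION", DEFAULT_PROTOCOL)
PredictorProtocol(protocol)  # raises ValueError for invalid values such as "v3"
```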
Fixes #2956
Type of change
Please delete options that are not relevant.
[X] Bug fix (non-breaking change which fixes an issue)
[ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
[ ] New feature (non-breaking change which adds functionality)
[ ] This change requires a documentation update
Feature/Issue validation/testing
A new test has been added to ensure that (a sketch of the three cases follows the list):
- If the PROTOCOL_VERSION environment variable is not set, a default is used
- If the PROTOCOL_VERSION environment variable is set to a valid value, that value is indeed the one used
- If the PROTOCOL_VERSION environment variable is set to an invalid value, the wrapper fails to start
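A hypothetical pytest sketch of those three cases; the resolve_protocol helper, the test names, and the "v1" default are illustrative stand-ins, not the actual test added in this PR:

```python
import os

import pytest


def resolve_protocol(default="v1"):
    # Stand-in for the wrapper's protocol resolution (assumed behavior).
    value = os.environ.get("PROTOCOL_VERSION", default)
    if value not in ("v1", "v2", "grpc-v2"):
        raise ValueError(f"{value} is not a valid PredictorProtocol")
    return value


def test_default_is_used_when_unset(monkeypatch):
    monkeypatch.delenv("PROTOCOL_VERSION", raising=False)
    assert resolve_protocol() == "v1"


def test_valid_value_is_used(monkeypatch):
    monkeypatch.setenv("PROTOCOL_VERSION", "v2")
    assert resolve_protocol() == "v2"


def test_invalid_value_fails(monkeypatch):
    monkeypatch.setenv("PROTOCOL_VERSION", "v3")
    with pytest.raises(ValueError):
        resolve_protocol()
```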
Checklist:
[X] Did you have fun?
[X] Have you added tests that prove your fix is effective or that this feature works?
[X] Has code been commented, particularly in hard-to-understand areas?
[X] Have you made corresponding changes to the documentation?
I currently don't know why linting failed on the changed files: I installed the pre-commit hooks and, just to be sure, manually ran them on the files I changed, and they were all deemed clean.
@agunapal I just realized that for the test to run properly, kserve must be installed and, in my experience, the kserve[storage] variant. While adding it to developer.txt, I got hit by an incompatible requirement: serve uses 4.25.1 while kserve requires <4.0.0.
What would you recommend going forward? Isolate the test and create a dedicated flow for it?
Yes, I'm not sure if we want to do this.
Can you please check if you can add a test here https://github.com/pytorch/serve/blob/master/kubernetes/kserve/tests/scripts/test_mnist.sh
After having fiddled a bit more with test_mnist.sh, I can suggest three solutions:
1. Set the kserve version to v0.11.0 (the last one before they started systematically defining the PROTOCOL_VERSION environment variable) in kserve_cpu_tests.yml.
2. Modify test_mnist.sh to deploy kserve v0.11.0 after the current set of tests and re-run mnist_v1_cpu.
3. Make a matrix of the KServe versions to test against.
Number one has the downside of using an outdated version of KServe.
Number two makes the tests last a bit longer but does not preclude updating the KServe version used for all other tests.
Number three will multiply the test duration by the number of KServe versions tested, but confirms that everything works properly on each of them. It also means that test_mnist.sh does not need to be modified.
The options are listed in ascending order of resource/time impact.
Which one would you like to have?
Thanks for looking into this.
If this is already implemented in the latest version, I think you can manually run a test with kserve 0.11.0 and attach the logs.
FYI, we are making a release soon. So, we will review this after the release.
Also curious: if this is resolved in the latest version of kserve, can't you use that?
KServe 0.11.0 run for protocol v2 prior to this PR
Check status of the pod
NAME READY STATUS RESTARTS AGE
torchserve-mnist-v2-predictor-00001-deployment-9569d6f8b-8w6pb 1/2 Running 0 2m
Name: torchserve-mnist-v2-predictor-00001-deployment-9569d6f8b-8w6pb
Namespace: default
Priority: 0
Service Account: default
Node: minikube/192.168.49.2
Start Time: Thu, 29 Feb 2024 10:26:12 +0100
Labels: app=torchserve-mnist-v2-predictor-00001
component=predictor
pod-template-hash=9569d6f8b
service.istio.io/canonical-name=torchserve-mnist-v2-predictor
service.istio.io/canonical-revision=torchserve-mnist-v2-predictor-00001
serviceEnvelope=kservev2
serving.knative.dev/configuration=torchserve-mnist-v2-predictor
serving.knative.dev/configurationGeneration=1
serving.knative.dev/configurationUID=a40d926b-2a09-4d00-9aa5-34e09fed7608
serving.knative.dev/revision=torchserve-mnist-v2-predictor-00001
serving.knative.dev/revisionUID=cdff5033-616f-454c-a27c-03984823cd74
serving.knative.dev/service=torchserve-mnist-v2-predictor
serving.knative.dev/serviceUID=f04ce05a-78a7-44ed-8473-3cf46fb117b6
serving.kserve.io/inferenceservice=torchserve-mnist-v2
Annotations: autoscaling.knative.dev/class: kpa.autoscaling.knative.dev
autoscaling.knative.dev/min-scale: 1
internal.serving.kserve.io/storage-initializer-sourceuri: gs://kfserving-examples/models/torchserve/image_classifier/v2
prometheus.kserve.io/path: /metrics
prometheus.kserve.io/port: 8082
serving.knative.dev/creator: system:serviceaccount:kserve:kserve-controller-manager
serving.kserve.io/enable-metric-aggregation: false
serving.kserve.io/enable-prometheus-scraping: false
Status: Running
IP: 10.244.0.17
IPs:
IP: 10.244.0.17
Controlled By: ReplicaSet/torchserve-mnist-v2-predictor-00001-deployment-9569d6f8b
Init Containers:
storage-initializer:
Container ID: docker://6f8e446c5344730eea59403bf3d9b3307fe96c3d8df8c6cdcd63773a852aec84
Image: kserve/storage-initializer:v0.11.0
Image ID: docker-pullable://kserve/storage-initializer@sha256:962682077dc30c21113822f50b63400d759ce6c49518d0c9caa638b6f77c7fed
Port:
Host Port:
Args:
gs://kfserving-examples/models/torchserve/image_classifier/v2
/mnt/models
State: Terminated
Reason: Completed
Exit Code: 0
Started: Thu, 29 Feb 2024 10:26:39 +0100
Finished: Thu, 29 Feb 2024 10:26:56 +0100
Ready: True
Restart Count: 0
Limits:
cpu: 1
memory: 1Gi
Requests:
cpu: 100m
memory: 100Mi
Environment:
Mounts:
/mnt/models from kserve-provision-location (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-8wbm2 (ro)
Containers:
kserve-container:
Container ID: docker://cc99e7eecaf2c842fcea9cc30fb301e6427a4cfa259f7b42cb0806e6a92ea1e0
Image: index.docker.io/pytorch/torchserve-kfs-nightly@sha256:5ec33cf3085b56b5eedb6b11c67faa5b7e669019573a26459eb49a2394531be2
Image ID: docker-pullable://pytorch/torchserve-kfs-nightly@sha256:5ec33cf3085b56b5eedb6b11c67faa5b7e669019573a26459eb49a2394531be2
Port: 8080/TCP
Host Port: 0/TCP
Args:
torchserve
--start
--model-store=/mnt/models/model-store
--ts-config=/mnt/models/config/config.properties
State: Running
Started: Thu, 29 Feb 2024 10:28:02 +0100
Ready: True
Restart Count: 0
Limits:
cpu: 1
memory: 1Gi
Requests:
cpu: 100m
memory: 256Mi
Environment:
TS_SERVICE_ENVELOPE: kservev2
PORT: 8080
K_REVISION: torchserve-mnist-v2-predictor-00001
K_CONFIGURATION: torchserve-mnist-v2-predictor
K_SERVICE: torchserve-mnist-v2-predictor
Mounts:
/mnt/models from kserve-provision-location (ro)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-8wbm2 (ro)
queue-proxy:
Container ID: docker://ad160b0a42f80dc14996c9acf54185125788bca57ec79eb36ca9193823a37521
Image: gcr.io/knative-releases/knative.dev/serving/cmd/queue@sha256:65c427aaab3be9cea1afea32cdef26d5855c69403077d2dc3439f75c26a1e83f
Image ID: docker-pullable://gcr.io/knative-releases/knative.dev/serving/cmd/queue@sha256:65c427aaab3be9cea1afea32cdef26d5855c69403077d2dc3439f75c26a1e83f
Ports: 8022/TCP, 9090/TCP, 9091/TCP, 8012/TCP, 8112/TCP
Host Ports: 0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP
State: Running
Started: Thu, 29 Feb 2024 10:28:07 +0100
Ready: False
Restart Count: 0
Requests:
cpu: 25m
Readiness: http-get http://:8012/ delay=0s timeout=1s period=10s #success=1 #failure=3
Environment:
SERVING_NAMESPACE: default
SERVING_SERVICE: torchserve-mnist-v2-predictor
SERVING_CONFIGURATION: torchserve-mnist-v2-predictor
SERVING_REVISION: torchserve-mnist-v2-predictor-00001
QUEUE_SERVING_PORT: 8012
QUEUE_SERVING_TLS_PORT: 8112
CONTAINER_CONCURRENCY: 0
REVISION_TIMEOUT_SECONDS: 300
REVISION_RESPONSE_START_TIMEOUT_SECONDS: 0
REVISION_IDLE_TIMEOUT_SECONDS: 0
SERVING_POD: torchserve-mnist-v2-predictor-00001-deployment-9569d6f8b-8w6pb (v1:metadata.name)
SERVING_POD_IP: (v1:status.podIP)
SERVING_LOGGING_CONFIG:
SERVING_LOGGING_LEVEL:
SERVING_REQUEST_LOG_TEMPLATE: {"httpRequest": {"requestMethod": "{{.Request.Method}}", "requestUrl": "{{js .Request.RequestURI}}", "requestSize": "{{.Request.ContentLength}}", "status": {{.Response.Code}}, "responseSize": "{{.Response.Size}}", "userAgent": "{{js .Request.UserAgent}}", "remoteIp": "{{js .Request.RemoteAddr}}", "serverIp": "{{.Revision.PodIP}}", "referer": "{{js .Request.Referer}}", "latency": "{{.Response.Latency}}s", "protocol": "{{.Request.Proto}}"}, "traceId": "{{index .Request.Header "X-B3-Traceid"}}"}
SERVING_ENABLE_REQUEST_LOG: false
SERVING_REQUEST_METRICS_BACKEND: prometheus
TRACING_CONFIG_BACKEND: none
TRACING_CONFIG_ZIPKIN_ENDPOINT:
TRACING_CONFIG_DEBUG: false
TRACING_CONFIG_SAMPLE_RATE: 0.1
USER_PORT: 8080
SYSTEM_NAMESPACE: knative-serving
METRICS_DOMAIN: knative.dev/internal/serving
SERVING_READINESS_PROBE: {"tcpSocket":{"port":8080,"host":"127.0.0.1"},"successThreshold":1}
ENABLE_PROFILING: false
SERVING_ENABLE_PROBE_REQUEST_LOG: false
METRICS_COLLECTOR_ADDRESS:
HOST_IP: (v1:status.hostIP)
ENABLE_HTTP2_AUTO_DETECTION: false
ROOT_CA:
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-8wbm2 (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
kube-api-access-8wbm2:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional:
DownwardAPI: true
kserve-provision-location:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit:
QoS Class: Burstable
Node-Selectors:
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 2m default-scheduler Successfully assigned default/torchserve-mnist-v2-predictor-00001-deployment-9569d6f8b-8w6pb to minikube
Normal Pulling 117s kubelet Pulling image "kserve/storage-initializer:v0.11.0"
Normal Pulled 100s kubelet Successfully pulled image "kserve/storage-initializer:v0.11.0" in 17.382s (17.382s including waiting)
Normal Created 93s kubelet Created container storage-initializer
Normal Started 93s kubelet Started container storage-initializer
Normal Pulling 76s kubelet Pulling image "index.docker.io/pytorch/torchserve-kfs-nightly@sha256:5ec33cf3085b56b5eedb6b11c67faa5b7e669019573a26459eb49a2394531be2"
Normal Pulled 13s kubelet Successfully pulled image "index.docker.io/pytorch/torchserve-kfs-nightly@sha256:5ec33cf3085b56b5eedb6b11c67faa5b7e669019573a26459eb49a2394531be2" in 1m2.971s (1m2.971s including waiting)
Normal Created 11s kubelet Created container kserve-container
Normal Started 10s kubelet Started container kserve-container
Normal Pulling 10s kubelet Pulling image "gcr.io/knative-releases/knative.dev/serving/cmd/queue@sha256:65c427aaab3be9cea1afea32cdef26d5855c69403077d2dc3439f75c26a1e83f"
Normal Pulled 6s kubelet Successfully pulled image "gcr.io/knative-releases/knative.dev/serving/cmd/queue@sha256:65c427aaab3be9cea1afea32cdef26d5855c69403077d2dc3439f75c26a1e83f" in 4.128s (4.128s including waiting)
Normal Created 5s kubelet Created container queue-proxy
Normal Started 5s kubelet Started container queue-proxy
Warning Unhealthy 2s (x2 over 4s) kubelet Readiness probe failed: Get "http://10.244.0.17:8012/": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
Wait for inference service to be ready
Wait for ports to be in forwarding
Forwarding from 127.0.0.1:8080 -> 8080
Forwarding from [::1]:8080 -> 8080
Make inference request
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0Handling connection for 8080
100 6754 100 62 100 6692 98 10654 --:--:-- --:--:-- --:--:-- 10737
✘ Test failed: Prediction: {"error":"ValueError:NoneisnotavalidPredictorProtocol"}, expected {"model_name":"mnist","model_version":"1.0","id":"d3b15cad-50a2-4eaf-80ce-8b0a428bd298","parameters":null,"outputs":[{"name":"input-0","shape":[1],"datatype":"INT64","parameters":null,"data":[1]}]}.
Delete cluster
KServe 0.11.0 run for default protocol prior to this PR
MNIST KServe V1 test begin
Deploying the cluster
inferenceservice.serving.kserve.io/torchserve created
Waiting for pod to come up...
Check status of the pod
NAME READY STATUS RESTARTS AGE
torchserve-predictor-00001-deployment-c8b8c4595-mxtpf 1/2 Running 0 115s
Name: torchserve-predictor-00001-deployment-c8b8c4595-mxtpf
Namespace: default
Priority: 0
Service Account: default
Node: minikube/192.168.49.2
Start Time: Thu, 29 Feb 2024 09:35:17 +0100
Labels: app=torchserve-predictor-00001
component=predictor
pod-template-hash=c8b8c4595
service.istio.io/canonical-name=torchserve-predictor
service.istio.io/canonical-revision=torchserve-predictor-00001
serviceEnvelope=kserve
serving.knative.dev/configuration=torchserve-predictor
serving.knative.dev/configurationGeneration=1
serving.knative.dev/configurationUID=36ca148d-50d5-402e-9c78-3becaa325179
serving.knative.dev/revision=torchserve-predictor-00001
serving.knative.dev/revisionUID=a1de2f03-1a4f-4620-a18c-e96e568709b8
serving.knative.dev/service=torchserve-predictor
serving.knative.dev/serviceUID=aeb06d26-b27d-42fc-8167-46c1521410ea
serving.kserve.io/inferenceservice=torchserve
Annotations: autoscaling.knative.dev/class: kpa.autoscaling.knative.dev
autoscaling.knative.dev/min-scale: 1
internal.serving.kserve.io/storage-initializer-sourceuri: gs://kfserving-examples/models/torchserve/image_classifier/v1
prometheus.kserve.io/path: /metrics
prometheus.kserve.io/port: 8082
serving.knative.dev/creator: system:serviceaccount:kserve:kserve-controller-manager
serving.kserve.io/enable-metric-aggregation: false
serving.kserve.io/enable-prometheus-scraping: false
Status: Running
IP: 10.244.0.17
IPs:
IP: 10.244.0.17
Controlled By: ReplicaSet/torchserve-predictor-00001-deployment-c8b8c4595
Init Containers:
storage-initializer:
Container ID: docker://98f5c07ad65f6095f52916559fa118f45c64ffb342637c2caac5b55aa8e718a6
Image: kserve/storage-initializer:v0.11.0
Image ID: docker-pullable://kserve/storage-initializer@sha256:962682077dc30c21113822f50b63400d759ce6c49518d0c9caa638b6f77c7fed
Port:
Host Port:
Args:
gs://kfserving-examples/models/torchserve/image_classifier/v1
/mnt/models
State: Terminated
Reason: Completed
Exit Code: 0
Started: Thu, 29 Feb 2024 09:35:43 +0100
Finished: Thu, 29 Feb 2024 09:35:59 +0100
Ready: True
Restart Count: 0
Limits:
cpu: 1
memory: 1Gi
Requests:
cpu: 100m
memory: 100Mi
Environment:
Mounts:
/mnt/models from kserve-provision-location (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-pfjj8 (ro)
Containers:
kserve-container:
Container ID: docker://9f5a3a87a2cb3d393e24806e178f7fbd66cdc74bf99875334336ca8897d6cac7
Image: index.docker.io/pytorch/torchserve-kfs-nightly@sha256:5ec33cf3085b56b5eedb6b11c67faa5b7e669019573a26459eb49a2394531be2
Image ID: docker-pullable://pytorch/torchserve-kfs-nightly@sha256:5ec33cf3085b56b5eedb6b11c67faa5b7e669019573a26459eb49a2394531be2
Port: 8080/TCP
Host Port: 0/TCP
Args:
torchserve
--start
--model-store=/mnt/models/model-store
--ts-config=/mnt/models/config/config.properties
State: Running
Started: Thu, 29 Feb 2024 09:37:05 +0100
Ready: True
Restart Count: 0
Limits:
cpu: 1
memory: 1Gi
Requests:
cpu: 100m
memory: 256Mi
Environment:
TS_SERVICE_ENVELOPE: kserve
PORT: 8080
K_REVISION: torchserve-predictor-00001
K_CONFIGURATION: torchserve-predictor
K_SERVICE: torchserve-predictor
Mounts:
/mnt/models from kserve-provision-location (ro)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-pfjj8 (ro)
queue-proxy:
Container ID: docker://eb7c37a231243d0e19973d392aa8a9425429eb10aea1c8a415353768cae214ad
Image: gcr.io/knative-releases/knative.dev/serving/cmd/queue@sha256:65c427aaab3be9cea1afea32cdef26d5855c69403077d2dc3439f75c26a1e83f
Image ID: docker-pullable://gcr.io/knative-releases/knative.dev/serving/cmd/queue@sha256:65c427aaab3be9cea1afea32cdef26d5855c69403077d2dc3439f75c26a1e83f
Ports: 8022/TCP, 9090/TCP, 9091/TCP, 8012/TCP, 8112/TCP
Host Ports: 0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP
State: Running
Started: Thu, 29 Feb 2024 09:37:11 +0100
Ready: False
Restart Count: 0
Requests:
cpu: 25m
Readiness: http-get http://:8012/ delay=0s timeout=1s period=10s #success=1 #failure=3
Environment:
SERVING_NAMESPACE: default
SERVING_SERVICE: torchserve-predictor
SERVING_CONFIGURATION: torchserve-predictor
SERVING_REVISION: torchserve-predictor-00001
QUEUE_SERVING_PORT: 8012
QUEUE_SERVING_TLS_PORT: 8112
CONTAINER_CONCURRENCY: 0
REVISION_TIMEOUT_SECONDS: 300
REVISION_RESPONSE_START_TIMEOUT_SECONDS: 0
REVISION_IDLE_TIMEOUT_SECONDS: 0
SERVING_POD: torchserve-predictor-00001-deployment-c8b8c4595-mxtpf (v1:metadata.name)
SERVING_POD_IP: (v1:status.podIP)
SERVING_LOGGING_CONFIG:
SERVING_LOGGING_LEVEL:
SERVING_REQUEST_LOG_TEMPLATE: {"httpRequest": {"requestMethod": "{{.Request.Method}}", "requestUrl": "{{js .Request.RequestURI}}", "requestSize": "{{.Request.ContentLength}}", "status": {{.Response.Code}}, "responseSize": "{{.Response.Size}}", "userAgent": "{{js .Request.UserAgent}}", "remoteIp": "{{js .Request.RemoteAddr}}", "serverIp": "{{.Revision.PodIP}}", "referer": "{{js .Request.Referer}}", "latency": "{{.Response.Latency}}s", "protocol": "{{.Request.Proto}}"}, "traceId": "{{index .Request.Header "X-B3-Traceid"}}"}
SERVING_ENABLE_REQUEST_LOG: false
SERVING_REQUEST_METRICS_BACKEND: prometheus
TRACING_CONFIG_BACKEND: none
TRACING_CONFIG_ZIPKIN_ENDPOINT:
TRACING_CONFIG_DEBUG: false
TRACING_CONFIG_SAMPLE_RATE: 0.1
USER_PORT: 8080
SYSTEM_NAMESPACE: knative-serving
METRICS_DOMAIN: knative.dev/internal/serving
SERVING_READINESS_PROBE: {"tcpSocket":{"port":8080,"host":"127.0.0.1"},"successThreshold":1}
ENABLE_PROFILING: false
SERVING_ENABLE_PROBE_REQUEST_LOG: false
METRICS_COLLECTOR_ADDRESS:
HOST_IP: (v1:status.hostIP)
ENABLE_HTTP2_AUTO_DETECTION: false
ROOT_CA:
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-pfjj8 (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
kube-api-access-pfjj8:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional:
DownwardAPI: true
kserve-provision-location:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit:
QoS Class: Burstable
Node-Selectors:
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 115s default-scheduler Successfully assigned default/torchserve-predictor-00001-deployment-c8b8c4595-mxtpf to minikube
Normal Pulling 113s kubelet Pulling image "kserve/storage-initializer:v0.11.0"
Normal Pulled 95s kubelet Successfully pulled image "kserve/storage-initializer:v0.11.0" in 17.862s (17.862s including waiting)
Normal Created 89s kubelet Created container storage-initializer
Normal Started 89s kubelet Started container storage-initializer
Normal Pulling 72s kubelet Pulling image "index.docker.io/pytorch/torchserve-kfs-nightly@sha256:5ec33cf3085b56b5eedb6b11c67faa5b7e669019573a26459eb49a2394531be2"
Normal Pulled 9s kubelet Successfully pulled image "index.docker.io/pytorch/torchserve-kfs-nightly@sha256:5ec33cf3085b56b5eedb6b11c67faa5b7e669019573a26459eb49a2394531be2" in 1m2.861s (1m2.861s including waiting)
Normal Created 7s kubelet Created container kserve-container
Normal Started 7s kubelet Started container kserve-container
Normal Pulling 7s kubelet Pulling image "gcr.io/knative-releases/knative.dev/serving/cmd/queue@sha256:65c427aaab3be9cea1afea32cdef26d5855c69403077d2dc3439f75c26a1e83f"
Normal Pulled 2s kubelet Successfully pulled image "gcr.io/knative-releases/knative.dev/serving/cmd/queue@sha256:65c427aaab3be9cea1afea32cdef26d5855c69403077d2dc3439f75c26a1e83f" in 4.483s (4.483s including waiting)
Normal Created 1s kubelet Created container queue-proxy
Normal Started 1s kubelet Started container queue-proxy
Wait for inference service to be ready
Wait for ports to be in forwarding
Forwarding from 127.0.0.1:8080 -> 8080
Forwarding from [::1]:8080 -> 8080
Make inference request
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0Handling connection for 8080
100 463 100 62 100 401 405 2621 --:--:-- --:--:-- --:--:-- 3026
✘ Test failed: Prediction: {"error":"ValueError:NoneisnotavalidPredictorProtocol"}, expected {"predictions":[2]}.
I did two runs to show that having the protocolVersion entry in the InferenceService definition (sketched below) does not change the failure.
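For reference, a minimal sketch of such an InferenceService definition, written as a Python dict to stay consistent with the other snippets here; the names and storageUri are taken from the v2 run above, and the remaining fields are illustrative:

```python
# Hypothetical InferenceService manifest (as a Python dict for illustration);
# even with "protocolVersion" set explicitly, the failure above was unchanged.
inference_service = {
    "apiVersion": "serving.kserve.io/v1beta1",
    "kind": "InferenceService",
    "metadata": {"name": "torchserve-mnist-v2"},
    "spec": {
        "predictor": {
            "model": {
                "modelFormat": {"name": "pytorch"},
                "protocolVersion": "v2",
                "storageUri": "gs://kfserving-examples/models/torchserve/image_classifier/v2",
            }
        }
    },
}
```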
While I can update the version of KServe, I think many people creating models to be used with torchserve do not control the underlying infrastructure they run on, so fixing the wrapper will have a broader impact.