seldon-core
seldon deployment doesn't wait for container to be ready
I have a SeldonDeployment that runs a couple of containers in the same pod. The configuration for one of them, a container of type COMBINER, looks similar to the following:
    apiVersion: machinelearning.seldon.io/v1alpha2
    kind: SeldonDeployment
    spec:
      name: my-app
      predictors:
      - componentSpecs:
        - spec:
            containers:
            - image: my-app-image:version
              name: combiner
              livenessProbe:
                failureThreshold: 3
                initialDelaySeconds: 60
                periodSeconds: 5
                successThreshold: 1
                httpGet:
                  path: /health/status
                  port: http
                  scheme: HTTP
                timeoutSeconds: 1
              readinessProbe:
                failureThreshold: 3
                initialDelaySeconds: 20
                periodSeconds: 5
                successThreshold: 1
                httpGet:
                  path: /health/status
                  port: http
                  scheme: HTTP
                timeoutSeconds: 1
The controller manager has executor.fullHealthChecks set to true.
Describe the bug
The combiner above hits errors at startup and never comes up; it is stuck in CrashLoopBackOff. Nevertheless, the SeldonDeployment is reported as healthy while the container keeps crashing (and maxUnavailable is 0 on the deployment).
Even when I deliberately implemented the health_status method to raise an exception, the deployment still succeeded despite the crashing container:
    def health_status(self):
        raise Exception("Health check failed due to some condition.")
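To make the mismatch described above easier to confirm, here is a minimal sketch that lists the containers of a pod that are not ready, together with the reason Kubernetes reports. It only reads the standard `status.containerStatuses` fields of a Pod object (for example the output of `kubectl get pod <pod-name> -o json`). The function name `unready_containers` and the sample pod status are hypothetical and for illustration only; they are not taken from the actual cluster.

```python
import json


def unready_containers(pod):
    """Return {container_name: reason} for every container that is not ready.

    `pod` is a Pod object parsed from JSON. Only the standard
    status.containerStatuses fields are inspected.
    """
    result = {}
    for cs in pod.get("status", {}).get("containerStatuses", []):
        if cs.get("ready"):
            continue
        # A crashing container normally sits in the "waiting" state with
        # reason CrashLoopBackOff between restarts.
        waiting = cs.get("state", {}).get("waiting") or {}
        result[cs["name"]] = waiting.get("reason", "NotReady")
    return result


# Hypothetical pod status, made up to mirror the situation in this report.
sample_pod = json.loads("""
{
  "status": {
    "containerStatuses": [
      {"name": "combiner", "ready": false,
       "state": {"waiting": {"reason": "CrashLoopBackOff"}}},
      {"name": "other-container", "ready": true,
       "state": {"running": {}}}
    ]
  }
}
""")

print(unready_containers(sample_pod))
```

If this returns a non-empty mapping while the SeldonDeployment status is still reported as available, that would match the behaviour described in this report.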
Expected behaviour
During a new deployment, a failing container should cause the deployment to fail.
Environment
kubectl server version 1.26.7
seldon controller version 1.17.0