node-feature-discovery
Make readiness and liveness probes configurable
What would you like to be added: I'd like the readiness and liveness probes to be configurable through the Helm values.
Why is this needed: The node-feature-discovery-master pods use gRPC probes, which are alpha and behind a feature gate in k8s v1.23, beta in v1.24, and GA in v1.27. This is a problem when deploying node-feature-discovery on older versions of Kubernetes.
On k8s v1.23 without the gRPC feature gate enabled, the node-feature-discovery-master pods never become ready. Here are the events:
Events:
  Type     Reason     Age                    From               Message
  ----     ------     ----                   ----               -------
  Normal   Scheduled  25m                    default-scheduler  Successfully assigned nvidia-gpu-operator/nvidia-gpu-operator-node-feature-discovery-master-7c8c9856svd9x to dev-worker-cpu-0
  Normal   Pulled     25m                    kubelet            Container image "registry.k8s.io/nfd/node-feature-discovery:v0.15.4" already present on machine
  Normal   Created    25m                    kubelet            Created container master
  Normal   Started    25m                    kubelet            Started container master
  Warning  Unhealthy  19m (x31 over 24m)     kubelet            Liveness probe errored: missing probe handler for nvidia-gpu-operator-node-feature-discovery-master-7c8c9856svd9x_nvidia-gpu-operator(d216ddfc-d0a7-4bb5-950f-38248bfc8d17):master
  Warning  Unhealthy  4m50s (x136 over 24m)  kubelet            Readiness probe errored: missing probe handler for nvidia-gpu-operator-node-feature-discovery-master-7c8c9856svd9x_nvidia-gpu-operator(d216ddfc-d0a7-4bb5-950f-38248bfc8d17):master
If the probes were configurable, I could point them at the HTTP Prometheus server running on port 8081, or just null them out:
readinessProbe:
  httpGet:
    path: /metrics
    port: 8081
  initialDelaySeconds: 5
  periodSeconds: 10
  timeoutSeconds: 2
  failureThreshold: 3
  successThreshold: 1
livenessProbe:
  httpGet:
    path: /metrics
    port: 8081
  initialDelaySeconds: 10
  periodSeconds: 10
  timeoutSeconds: 2
  failureThreshold: 3
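For the "null them out" case, a values override might look roughly like the sketch below. The `master.readinessProbe` and `master.livenessProbe` keys are hypothetical names used only for illustration; the chart does not expose them today, and the final key names would be up to the maintainers. The sketch also assumes the chart would skip rendering a probe whose value is empty.

```yaml
# Hypothetical values override: these keys do not exist in the chart yet.
master:
  # Setting a key to null in a Helm values override removes the chart's
  # default for that key; this assumes the template omits the probe
  # entirely when the value is empty.
  readinessProbe: null
  livenessProbe: null
```

With that shape, the chart's default values could keep the current gRPC probes, so nothing would change for clusters where gRPC probes are available.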