Add configurable probe timeout for activator to support high-latency environments
What version of Knative?
knative-v1.20.0-28-gff5c15ac5-dirty
Expected Behavior
Activator should be able to successfully probe queue-proxy health endpoints in environments with higher network latency.
Actual Behavior
Activator health check probes timeout with context deadline exceeded errors when network latency exceeds the hardcoded 300ms probe timeout. This is particularly problematic in service mesh environments where Envoy adds additional latency to the request path.
Steps to Reproduce the Problem
- Deploy Knative Serving with a service mesh (e.g., Istio) in a higher network latency system.
- Deploy a Knative service
- Observe activator logs showing probe timeout errors when network latency > 300ms
- Check
curlresponse times to queue-proxy/healthzendpoint - they exceed 300ms due to mesh overhead
Hi @bindrad ,
could you explain how your setup looks like if you have >300ms latency inside of you Kubernetes cluster? 300ms is basically a roundtrip to the other side of earth so it seems there are major issues inside of your cluster if this is actually the case - even with a service mesh in place.
Nevertheless I'm also not a big fan of hard coded defaults (even if 300ms is IMO plenty). I'll give your PR a look
/assign @bindrad