```
workflow-controller-869578f6c7-7jvhj   0/1     Running   0          3m27s
workflow-controller-869578f6c7-sc9kx   1/1     Running   0          12m
```
As expected, one pod is not ready.
```
Warning  Unhealthy  24s  kubelet  Readiness probe failed: Get "http://10.127.212.119:9090/metrics": dial tcp...
```
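For anyone reproducing this: a readiness probe along these lines (a minimal sketch; the port matches the metrics endpoint above, but the path and timing values are assumptions, not our actual manifest) is what fails on the non-leader pod, since only the leader serves metrics:
```
readinessProbe:
  httpGet:
    path: /metrics
    port: 9090
  initialDelaySeconds: 10   # illustrative values, not our real settings
  periodSeconds: 30
```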
Plus, it is tricky for updates/deploys, since the new pod cannot come up while the old pod is the leader. I had to delete the old ReplicaSet to let the new...
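The manual workaround was something like this (namespace and ReplicaSet name are placeholders):
```
# find the old ReplicaSet still backing the Deployment, then delete it
kubectl -n argo get replicasets
kubectl -n argo delete replicaset <old-workflow-controller-rs>
```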
@alexec we are not using those metrics right now. We noticed this when the Prometheus server showed errors.
I don't see other problems. There is a rolling update strategy:
```
rollingUpdate:
  maxSurge: 25%
  maxUnavailable: 25%
```
When applying the new deploy YAML, one new pod tried to come...
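For completeness, that block sits under the Deployment spec roughly like this (a sketch, not our full manifest):
```
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
```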
It looks like a pod cannot be picked as leader if it is not ready. I just did another test: set the rolling update strategy to 50%, with readinessProbe configured. When...
BTW, if we remove the workflow-controller-metrics Service, what will be the impact (besides no metrics exposed)? Does anything depend on the exposed metrics?
During deployment, when the readinessProbe failed, the pod "ready" status showed "0/1" (instead of "1/1") and the deployment got stuck. I am not sure why. And our setup uses prometheus-operator, which uses ServiceMonitor -> Service...
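The scrape path is wired up roughly like this (a sketch; the name, namespace, labels, and port name are placeholders, not our exact manifests):
```
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: workflow-controller-metrics
  namespace: argo
spec:
  selector:
    matchLabels:
      app: workflow-controller   # must match the Service's labels
  endpoints:
    - port: metrics              # must match the port name on the Service
```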
With the Recreate strategy, it unblocks the deploy. But with one pod always in NotReady status, it may trigger alerts too (we monitor unhealthy pods in our production env). If there is no...
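For reference, that just means switching the Deployment strategy to (sketch):
```
spec:
  strategy:
    type: Recreate
```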
Thanks for the quick response and explanation. Just one more question: this change (removing the non-leader metrics endpoint) started from which version? Thanks.