daemon: Make cilium status independent from k8s status
This PR adds an includeK8sCheck flag (default: false) to exclude the k8s probe from the result of the health probe. The flag can be set via a newly added HTTP header, include-k8s-check, on Cilium's /healthz API and the pod's /healthz endpoint.
Fixes: https://github.com/cilium/cilium/issues/31702
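For illustration only, here is a minimal sketch of a header-gated health handler along the lines described above; the probe function names, listen address, and header value handling are assumptions, not Cilium's actual API implementation:

```go
package main

import (
	"net/http"
)

// probeDatapath and probeK8s are hypothetical stand-ins for the real
// subsystem checks; only their pass/fail result matters for this sketch.
func probeDatapath() error { return nil }
func probeK8s() error      { return nil }

// healthzHandler returns 200 only when the required probes pass. The k8s
// probe is consulted only if the client sends the opt-in header, mirroring
// the opt-in behavior this PR describes (header name taken from the
// description; everything else is illustrative).
func healthzHandler(w http.ResponseWriter, r *http.Request) {
	if err := probeDatapath(); err != nil {
		http.Error(w, "datapath: "+err.Error(), http.StatusServiceUnavailable)
		return
	}
	if r.Header.Get("include-k8s-check") == "true" {
		if err := probeK8s(); err != nil {
			http.Error(w, "k8s: "+err.Error(), http.StatusServiceUnavailable)
			return
		}
	}
	w.WriteHeader(http.StatusOK)
}

func main() {
	http.HandleFunc("/healthz", healthzHandler)
	// Listen address chosen for illustration only.
	_ = http.ListenAndServe("localhost:9876", nil)
}
```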
@tkna I'm going to mark this for more discussion. While I see your issue, I'm not sure disabling liveness probes is ideal from Cilium's point of view.
I'm going to pull in some folks from your linked CFP for further discussion. Let's go over this in that issue.
I think this solution makes sense. We discussed this issue in the APAC community meeting and concluded that killing the whole Cilium agent on k8s disconnection is too aggressive. We might want to kill a specific subsystem, but probably not the entire agent.
For example, the real impact of a k8s disconnection on the BGP Control Plane is that the user's configuration changes may not be propagated in a timely manner; BGP otherwise works without any problem. I'd assume that most of the time users want to avoid connectivity loss rather than a configuration delay.
There might be a subsystem for which this behavior is desirable, so opt-in might be a safer option. However, in my personal opinion, we could simply remove k8s connectivity from the health criteria entirely and fix such a subsystem to stop individually.
Let me CC @joestringer as a person who was involved in the original discussion and also as a core maintainer.
I've been thinking about this PR too :sweat_smile:
AFAIK the controller that updates connectivity to the apiserver is still running here: https://github.com/cilium/cilium/blob/0001f4b3fcf950ba6a3bb6fb0cba14931b340ce3/daemon/cmd/status.go#L804
So I would at least expect cilium-dbg status to show whether connectivity to k8s works, but a metric for the general status of Cilium's connectivity to the control plane (k8s control plane / kvstore) would be nice too.
FYI, there is also a controller that is supposed to close all connections if the health check of the k8s apiserver fails: https://github.com/cilium/cilium/blob/c32980607c16cf6daddea300f8318e217e31bc34/pkg/k8s/client/cell.go#L290 So if that part works, there is also that fail-safe mechanism.
I'm not sure if there is a specific argument to keep kube-apiserver connectivity as part of the base health check.
There was a bug in the past with connectivity to the kube-apiserver that was only resolved by restarting the agent: https://github.com/cilium/cilium/issues/20915 So while I also lean towards enabling it by default, we definitely need an option to retain the previous behavior, just in case, for a transition period.
CC @hemanthmalla, we talked about this a bit before; I'm guessing you might be interested in this too and probably have some thoughts as well. You might want something similar for the kvstore, so we could potentially make this a bit more generic.
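Conceptually, the fail-safe controller linked above behaves like a periodic apiserver heartbeat that tears connections down on failure. A rough sketch of that idea, assuming made-up function names, interval, and close behavior (this is not the linked Cilium implementation):

```go
package main

import (
	"context"
	"log"
	"net/http"
	"time"
)

// heartbeat probes the apiserver's /healthz on a fixed interval and calls
// closeConns whenever the probe fails. Names, the 30s interval, and the
// close behavior are illustrative only.
func heartbeat(ctx context.Context, apiserverURL string, closeConns func()) {
	ticker := time.NewTicker(30 * time.Second)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			reqCtx, cancel := context.WithTimeout(ctx, 5*time.Second)
			req, err := http.NewRequestWithContext(reqCtx, http.MethodGet, apiserverURL+"/healthz", nil)
			if err != nil {
				cancel()
				continue
			}
			resp, err := http.DefaultClient.Do(req)
			cancel()
			if err != nil {
				log.Println("apiserver heartbeat failed:", err)
				closeConns()
				continue
			}
			resp.Body.Close()
			if resp.StatusCode != http.StatusOK {
				log.Println("apiserver heartbeat returned", resp.Status)
				closeConns()
			}
		}
	}
}

func main() {
	// Force clients to re-establish connections on the next request; a real
	// agent would do a more targeted teardown.
	closeConns := func() {
		http.DefaultTransport.(*http.Transport).CloseIdleConnections()
	}
	heartbeat(context.Background(), "https://127.0.0.1:6443", closeConns)
}
```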
Isn't one of the assumptions users can make about the restart behavior of the cilium-agent that connectivity will continue to work even as the cilium-agent is killed or restarted? I'm confused by the comments here because they assume there is a coupling between cilium-agent availability and Pod network connectivity. Are we referring to "connectivity loss" as connectivity that is out of sync with user-configured state in the k8s apiserver?
> Isn't one of the assumptions users can make about the restart behavior of the cilium-agent that connectivity will continue to work even as the cilium-agent is killed or restarted?
The issue is BGP advertisement. If the cilium-agent goes offline whenever kube-api is unavailable, Cilium also can't advertise routes to BGP peers, so kube-api availability effectively impacts load-balancer availability, for which there is no good reason.
> Isn't one of the assumptions users can make about the restart behavior of the cilium-agent that connectivity will continue to work even as the cilium-agent is killed or restarted?
The other aspect I'll point out is that this PR suggests that if the kube-apiserver goes down, that will force every cilium-agent instance in the cluster to also go down. Similar to how you're asserting that the dataplane should continue without the local control plane, I think it makes sense for the local control plane to continue without the central control plane.
What's the next step on this PR? I see a few comments suggesting to change some things. @tkna are you still working on this? Do you have any questions for reviewers?
@joestringer @marseel @YutaroHayakawa @ysksuzuki @ldelossa @learnitall
I made some minor corrections. As far as I know, the remaining discussion points are as follows:

- Whether --require-k8s-connectivity should be implemented as an API parameter or a daemon config. I implemented it as an API parameter for now. I think an API parameter is more flexible and easier to apply/roll back, since it is configured via an HTTP header and can be changed per request (see the client sketch below).
- Whether --require-k8s-connectivity should be false or true by default. I implemented it as default: false for now, but the discussion is still open.
  > So while I also lean towards enabling it by default, we definitely need an option to retain the previous behavior, just in case, for a transition period.
  In my understanding, @marseel's opinion is that --require-k8s-connectivity should be true by default at this point.
- Whether we should generalize this mechanism (to exclude the kvstore probe, etc.). I don't think the generalization is necessary unless there is a requirement.
- Whether we should introduce a new metric. As @marseel mentioned, connectivity to the kube-apiserver can already be checked via cilium-dbg status and logs from the "k8s-heartbeat" mechanism, so I don't think a new metric is necessary, at least for this PR. (It can be another PR.)
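For completeness, here is how a per-request override via an HTTP header could be exercised from a client; the agent address and header name below are assumptions for illustration, not the PR's final API surface:

```go
package main

import (
	"fmt"
	"net/http"
)

func main() {
	// The agent address and header name are illustrative only.
	req, err := http.NewRequest(http.MethodGet, "http://localhost:9876/healthz", nil)
	if err != nil {
		panic(err)
	}
	// Opt this particular probe into also requiring kube-apiserver connectivity.
	req.Header.Set("require-k8s-connectivity", "true")

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	fmt.Println("health status:", resp.Status)
}
```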
Thanks @tkna for the detailed summary.
> Whether --require-k8s-connectivity should be false or true by default. I implemented it as default: false for now, but the discussion is still open.
I think long-term it should be true by default, but it would require significantly more testing. I'm fine with having it as false by default for now and then following up with making it true.
@marseel
Just for clarification, --require-k8s-connectivity: true retains the existing behavior, and false makes cilium-dbg status (and the livenessProbe) independent from the k8s probe.
Do you mean:
- long-term: default: false (making cilium-dbg status independent from the k8s probe)
- for now: default: true (retaining the existing behavior)
?
Yeah, sorry I wasn't clear. I meant that long-term the k8s health check would be disabled by default, but for now let's keep the current behaviour.
@marseel
Fixed to keep the existing behavior by default.
Since I'm also considering a boolean option for cilium-dbg status, I renamed the option from require-k8s-connectivity to without-k8s-connectivity so that it defaults to false.
If this PR seems to be OK, could you trigger E2E tests?
/test
@ovidiutirla Some E2E tests seemed to be failing for reasons unrelated to this PR. I just rebased it, so could you run the E2E tests again?
/test
Looks like @tkna has made all the requested changes as agreed here; is there anything else we need to do to keep this moving? @joestringer, @YutaroHayakawa, @marseel?
/test
@ldelossa Is there anything else I should do about this PR?
/test
/ci-ginkgo
/ci-runtime
/ci-integration
Thanks for the contribution @tkna !
@joestringer Do you think this comment should be modified since it is out of date? https://github.com/cilium/cilium/pull/32724#issue-2318459880
withoutK8sConnectivity
@tkna yes, that would be helpful. When this PR becomes part of a release, the release notes will point back to this PR. If a user is interested in the feature, they may open this PR and read the description.
@joestringer I modified https://github.com/cilium/cilium/pull/32724#issue-2318459880 to reflect requireK8sConnectivity.