Kubernetes leader split brain in Blue-Green Deployment
Describe the bug
The Kubernetes leader library currently uses a ConfigMap to grant leadership to one of the pods in a namespace. This works fine for a single deployment in a single namespace. However, when we run a blue-green deployment in the same cluster, with the blue and green deployments in different namespaces, and keep one of the deployments on standby, each namespace elects its own leader. This leads to split brain: there are two leaders doing the same job.
Please let us know how we can fix this.
Can you provide a complete, minimal, verifiable sample that reproduces the problem? It should be available as a GitHub (or similar) project or attached to this issue as a zip file.
@ryanjbaxter
I'm using the Kubernetes example as the application
https://github.com/spring-cloud/spring-cloud-kubernetes/tree/main/spring-cloud-kubernetes-examples/kubernetes-leader-election-example
Attached are two deployments: the leadertest1 namespace holds the blue deployment and the leadertest2 namespace holds the green deployment. Traffic to both deployments is managed by an ingress.
Now run the commands below and observe the leader in each namespace:

- `curl localhost:28756` reports one of the blue deployment pods as leader.
- `curl localhost:28757` reports one of the green deployment pods as leader.

This effectively gives us two leaders in one environment.
I'm looking for a way to configure a common lock between the blue and green deployments so that there is always only one leader.
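For context, this is roughly the configuration I tried in order to point both deployments at a single shared lock, using the leader properties from `spring.cloud.kubernetes.leader.*` (the `leader-lock` namespace name here is just an example, not our real setup):

```yaml
# application.yml for BOTH the blue and the green deployment,
# pointing them at one shared ConfigMap lock.
spring:
  cloud:
    kubernetes:
      leader:
        # Name of the ConfigMap used as the leadership lock.
        config-map-name: leaders
        # Namespace holding that ConfigMap ("leader-lock" is an example;
        # it must be readable and writable from both deployments).
        namespace: leader-lock
```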
@ryanjbaxter
I've tried providing the namespace property, but I still see two leaders, with the error below in the logs:
```
{"t":"2022-01-10T16:55:58.926Z","msg":"Invalid event type","lgr":"io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager","trd":"OkHttp https://172.20.0.1/...","lvl":"ERROR"}
java.lang.IllegalArgumentException: Item needs to be one of [Node, Deployment, ReplicaSet, StatefulSet, Pod, ReplicationController], but was: [Unknown (null)]
	at io.fabric8.kubernetes.client.internal.readiness.Readiness.isReady(Readiness.java:62)
	at io.fabric8.kubernetes.client.dsl.base.BaseOperation.isReady(BaseOperation.java:1104)
	at org.springframework.cloud.kubernetes.leader.LeadershipController.isPodReady(LeadershipController.java:238)
	at org.springframework.cloud.kubernetes.leader.LeadershipController.update(LeadershipController.java:81)
	at org.springframework.cloud.kubernetes.leader.LeaderRecordWatcher.eventReceived(LeaderRecordWatcher.java:85)
	at org.springframework.cloud.kubernetes.leader.LeaderRecordWatcher.eventReceived(LeaderRecordWatcher.java:30)
	at io.fabric8.kubernetes.client.utils.WatcherToggle.eventReceived(WatcherToggle.java:49)
	at io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager$1.onMessage(WatchConnectionManager.java:236)
	at okhttp3.internal.ws.RealWebSocket.onReadMessage(RealWebSocket.java:322)
	at okhttp3.internal.ws.WebSocketReader.readMessageFrame(WebSocketReader.java:219)
	at okhttp3.internal.ws.WebSocketReader.processNextFrame(WebSocketReader.java:105)
	at okhttp3.internal.ws.RealWebSocket.loopReader(RealWebSocket.java:273)
	at okhttp3.internal.ws.RealWebSocket$1.onResponse(RealWebSocket.java:209)
	at okhttp3.RealCall$AsyncCall.execute(RealCall.java:174)
	at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
	at java.base/java.lang.Thread.run(Unknown Source)
```
@ryanjbaxter
I've found that the issue is with the piece of code below, in `LeadershipController`'s `update` method:

```java
if (leader != null && this.isPodReady(leader.getId())) {
    this.handleLeaderChange(leader);
}
```
The issue is happening for the following reason:
- The Spring Cloud Kubernetes leader library provides a way to configure the namespace of the ConfigMap.
- We have two deployments, blue and green, in two different namespaces in the same cluster.
- We created cross-namespace role bindings to grant access to the ConfigMap and pods across the blue and green namespaces.
- Once we deploy to both the blue and green namespaces, one of them acquires leadership; let's assume a pod in the blue namespace became leader.
- The green pod then has to revoke its leadership claim through the `LeadershipController.update` API, but this call fails with the error above because `isPodReady` checks whether the leader is ready only in the current namespace, while in this case the leader is in the blue namespace.
There should be a way to enable a blue-green mode, providing the namespaces of the deployments and the namespace for the ConfigMap.
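To make the cross-namespace role binding concrete, this is roughly the shape of the RBAC we applied; the names are illustrative, not our exact manifests (the Role lives in the blue namespace, and its RoleBinding also grants the green namespace's service account access):

```yaml
# Role in the blue namespace granting access to the lock ConfigMap and pods.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: leader-election
  namespace: leadertest1        # blue namespace
rules:
  - apiGroups: [""]
    resources: ["configmaps", "pods"]
    verbs: ["get", "list", "watch", "create", "update"]
---
# RoleBinding in the blue namespace whose subjects include the
# service account of the green namespace, so both sides can touch the lock.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: leader-election
  namespace: leadertest1
subjects:
  - kind: ServiceAccount
    name: default
    namespace: leadertest2      # green namespace
roleRef:
  kind: Role
  name: leader-election
  apiGroup: rbac.authorization.k8s.io
```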
@ryanjbaxter We are highly dependent on the leader because we use it to run a job at least once and at most once. Kindly help us with this.
It's hard to say for sure because I am not terribly familiar with the code, but if the problem is just that the readiness check only happens in the current namespace, we could modify the method to look across all namespaces:
```java
private boolean isPodReady(String name) {
    return this.kubernetesClient.pods()
            .inAnyNamespace()
            .withField("metadata.name", name)
            .list()
            .getItems()
            .stream()
            .anyMatch(p -> p.getStatus().getContainerStatuses().stream()
                    .anyMatch(containerStatus -> containerStatus.getReady()));
}
```
If you have some time to try that out and see whether it fixes the problem, that would be very helpful.
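Independently of a cluster, the readiness logic in that suggestion boils down to: the leader counts as ready if any pod matching its name, in any namespace, has at least one ready container. A minimal stand-alone sketch of just that predicate, using hypothetical stand-in types rather than the real fabric8 model classes:

```java
import java.util.List;

public class ReadinessSketch {

    // Hypothetical stand-ins for the fabric8 model: a container's ready flag,
    // and a pod with a name plus its container statuses.
    record ContainerStatus(boolean ready) {}
    record Pod(String name, List<ContainerStatus> statuses) {}

    // Mirrors the proposed isPodReady logic: given all pods (across all
    // namespaces) whose metadata.name matched the leader id, the leader is
    // considered ready when any of them has at least one ready container.
    static boolean isPodReady(List<Pod> podsMatchingName) {
        return podsMatchingName.stream()
                .anyMatch(p -> p.statuses().stream().anyMatch(ContainerStatus::ready));
    }

    public static void main(String[] args) {
        List<Pod> pods = List.of(
                new Pod("leader-abc", List.of(new ContainerStatus(false))),
                new Pod("leader-abc", List.of(new ContainerStatus(true))));
        // The second copy of the pod has a ready container, so this prints true.
        System.out.println(isPodReady(pods));
    }
}
```

With the original in-namespace lookup, the list for the other namespace's leader is effectively empty, and an empty list yields `false`, which is what made the green pod's check fail.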
@ryanjbaxter this looks like it's working; I'm still testing.
If this fix works, I would like to understand the points below:
- Will this fix be available in a 1.0.x release? When is the next release date?
Great please let me know how your testing goes.
It will not go into the 1.0.x branch; we no longer support that branch. It would go into the 2.0.x, 2.1.x, and 3.0.x branches.
@Srinivas-Karre can you please let me know if this change worked?
If you would like us to look at this issue, please provide the requested information. If the information is not provided within the next 7 days this issue will be closed.
@ryanjbaxter As far as I remember, the fix was not working in a few cases, so we've decided to keep the leader use case within a single namespace.