fdb-kubernetes-operator
Knob rollout can be delayed by coordinator changes
What happened?
We currently have an e2e pipeline that tests how long it takes to roll out a knob. In this test case we noticed that a coordinator change delays the rollout. The reason is that updatePodDynamicConf checks whether both the fdb.cluster file and fdbmonitor.conf are up to date. If the coordinators change, the content of the fdb.cluster file changes as well, so the knob rollout has to wait until the new file has been synced. This is especially an issue for HA clusters, where the knob rollout is done by independent operator instances. We have to validate whether it's safe to ignore the fdb.cluster update in those cases.
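To make the interaction concrete, here is a minimal Go sketch of the behaviour described above. It is not the operator's code: the `podConfigState` type and both functions are hypothetical names introduced for illustration, and the relaxed check only shows the idea that still needs to be validated for safety.

```go
package main

import "fmt"

// podConfigState is a hypothetical summary of the config files inside a Pod.
type podConfigState struct {
	ClusterFileUpToDate bool // fdb.cluster matches the desired connection string
	MonitorConfUpToDate bool // fdbmonitor.conf matches the desired config, including knobs
}

// strictCheck models the behaviour described in this issue: both files
// must be in sync before the Pod counts as updated.
func strictCheck(s podConfigState) bool {
	return s.ClusterFileUpToDate && s.MonitorConfUpToDate
}

// relaxedCheck models the open question: only fdbmonitor.conf is considered
// for a knob rollout. Whether this is safe is an assumption, not a result.
func relaxedCheck(s podConfigState) bool {
	return s.MonitorConfUpToDate
}

func main() {
	// A coordinator change happened after the knob was already synced:
	// fdbmonitor.conf is current, fdb.cluster is stale.
	s := podConfigState{ClusterFileUpToDate: false, MonitorConfUpToDate: true}

	fmt.Println("strict check passes:", strictCheck(s))
	fmt.Println("relaxed check passes:", relaxedCheck(s))
}
```

With the strict check, a Pod whose knobs are already in place is still reported as not updated while its fdb.cluster file is stale, which is the delay observed in the e2e test.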
What did you expect to happen?
I would expect that the knob rollout is not affected by changing coordinators.
How can we reproduce it (as minimally and precisely as possible)?
Create a cluster, change a knob, and then do a replacement to trigger a coordinator change. Alternatively, use an HA cluster and do a knob change there.
Anything else we need to know?
No response
FDB Kubernetes operator
$ kubectl fdb version
latest
Kubernetes version
$ kubectl version
v1.22.11
Cloud provider
I have an idea how to solve this with a change in our restart procedure. I'll put a design doc together next week.
I changed this from bug to documentation. This is currently a limitation of the design, and changing the design will require more work.