Controller not able to run after node failure
Testing node failure on a 2 node cluster, the Piraeus controller was not able to come back up after being rescheduled to the remaining healthy node. The error was about the etcd database:
```
20:05:29.276 [Main] INFO LINSTOR/Controller - SYSTEM - Initializing the etcd database
20:06:29.725 [Main] ERROR LINSTOR/Controller - SYSTEM - Database initialization error [Report number 60C27085-00000-000000]
```
Doing the same test on a 3 node cluster, the recovery was OK. Does etcd need to run with a minimum of 3 replicas? I am thinking 1 replica is not enough, because if I run etcd with 1 replica and hostPath volumes, it will never come up again if the node with the etcd pod fails. Is this correct?
In the 3 node scenario, is it guaranteed that etcd will always come back OK regardless of which node fails? I am finding this setup, where etcd uses volumes from another storage provider, complicated to deal with. Can anyone provide an example of production usage of the linstor-operator regarding the etcd StatefulSet?
Hi! You are correct with your assumptions: etcd will only allow (write) access to the database if a majority of nodes are available. In the 2 node scenario, only exactly half of the nodes remain available after a failure, which is not enough for etcd. This is done to prevent a situation where a network partition would mean that both etcd instances could start writing because each believes it is in the majority.
In this scenario, a 3 node cluster will continue to work as long as at most one node fails, regardless of which one (see the quick quorum sketch below).
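To make the quorum arithmetic concrete, here is a minimal sketch (just an illustration, not part of the operator) of how etcd's majority rule translates into fault tolerance for different cluster sizes:

```go
package main

import "fmt"

func main() {
	// etcd needs a majority (quorum) of members to accept writes:
	// quorum = floor(n/2) + 1, so a cluster of n members
	// tolerates n - quorum member failures.
	for n := 1; n <= 5; n++ {
		quorum := n/2 + 1
		fmt.Printf("members=%d quorum=%d tolerated failures=%d\n",
			n, quorum, n-quorum)
	}
	// members=1 quorum=1 tolerated failures=0
	// members=2 quorum=2 tolerated failures=0  <- your 2 node case
	// members=3 quorum=2 tolerated failures=1  <- your 3 node case
	// members=4 quorum=3 tolerated failures=1
	// members=5 quorum=3 tolerated failures=2
}
```

Note that 4 members tolerate no more failures than 3, which is why odd cluster sizes are the usual recommendation.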
We are also dissatisfied with the current state of affairs. I can't promise anything, but we are investigating whether using the Kubernetes API as a datastore would be feasible. In that case, you would no longer need etcd and its extra storage volumes.
Hello and thank you for the clarification.
Using the k8s API would be a very good solution. Hope it works!
FYI, Calico seems to have taken a similar approach - https://docs.projectcalico.org/getting-started/kubernetes/hardway/the-calico-datastore
Using the k8s backend has been possible since v1.7.0, released back in December 2021.
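For anyone finding this later: with the k8s backend, the switch is a chart configuration change. The sketch below shows roughly what the Helm values look like; the exact keys (operator.controller.dbConnectionURL, etcd.enabled) are from memory, so verify them against the values.yaml of the chart version you are using.

```yaml
# Rough sketch for piraeus-operator >= v1.7.0 -- key names are
# assumptions, check the chart's values.yaml before applying.
operator:
  controller:
    # "k8s" makes the LINSTOR controller persist its state through
    # the Kubernetes API instead of an external etcd.
    dbConnectionURL: k8s
etcd:
  # With the k8s backend the etcd StatefulSet (and its volumes)
  # is no longer needed.
  enabled: false
```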