On pod failure, reuse the data volume.
Problem
Currently, when a pod fails, the operator creates a new pod, lets the new node join the cluster, and syncs all data to the new node.
But this is risky: in a 3-node cluster, if another node fails while that sync is in progress, the cluster becomes unrecoverable.
We have encountered this problem: under heavy load, the master node may stop responding for a short period, and when that happens there is a chance the pod will be killed by the Kubernetes liveness check. If 2 nodes are killed back to back, the cluster won't recover.
Possible solution: restartPolicy of K8s
Using restartPolicy would make the kubelet restart the container in place when etcd fails, keeping the existing data volume. There are still some problems to solve with this approach.
When the container is restarted by Kubernetes, it is started with its original arguments, but restarting a node as an existing member requires different arguments. Maybe we can solve this with the discovery service. I don't know whether the discovery service still works after the cluster has been bootstrapped, but it is pretty clear that the operator must change the discovery key when adding/removing nodes.
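A minimal sketch of what such a pod spec might look like. The pod name, image tag, probe settings, and the discovery token are all assumptions for illustration, not the operator's actual manifest:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: etcd-0                 # hypothetical member pod name
spec:
  restartPolicy: Always        # kubelet restarts the container in place,
                               # reusing the pod's existing data volume
  containers:
  - name: etcd
    image: quay.io/coreos/etcd:v3.1.8   # assumed version
    command:
    - etcd
    - --name=etcd-0
    - --data-dir=/var/etcd/data
    # a discovery URL avoids baking --initial-cluster into the
    # arguments; whether discovery still works for a member that
    # restarts after bootstrap is the open question noted above
    - --discovery=https://discovery.etcd.io/<token>   # placeholder token
    livenessProbe:
      exec:
        command: ["etcdctl", "endpoint", "health"]
      initialDelaySeconds: 10
      periodSeconds: 10
    volumeMounts:
    - name: data
      mountPath: /var/etcd/data
  volumes:
  - name: data
    emptyDir: {}               # survives container restarts, not pod deletion
```

Note that an emptyDir volume survives container restarts within the same pod, which is exactly the case restartPolicy covers; it does not survive pod deletion.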
Possible solution: use a PV
Sometimes a PV is slow, and the extra latency may reduce etcd's performance greatly. So I list it as a solution with some trade-offs.
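A sketch of how the data directory could be backed by a PVC so it survives pod deletion; the claim name, storage class, and size are assumptions:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: etcd-0-data            # hypothetical: one claim per member
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 8Gi             # assumed size
---
# in the member's pod spec, replace the data volume with the claim:
# volumes:
# - name: data
#   persistentVolumeClaim:
#     claimName: etcd-0-data
# and mount it at etcd's --data-dir (e.g. /var/etcd/data)
```

With this layout a replacement pod for the same member can reattach the old claim and restart with its existing data, at the cost of whatever latency the underlying storage adds.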