Guess etcd replicas number function
According to the minutes of the latest meeting (2024-06-18), we decided that we need a function that guesses the required number of etcd replicas.
It can be used for recovering a non-existing STS object and also for scaling from 0.
Design ref: https://github.com/aenix-io/etcd-operator/pull/181
Proposal:
- Create variable `guessed=0`
- Check cluster-state configmap
  - if configmap exists and `initial-cluster-members` defined
    - if there are any hostnames defined in `initial-cluster-members`
      - take the hostname of pod with highest number and +1
      - save value into `guessed` variable
- Check endpoints for etcd-headless service
  - if there are any endpoints
    - connect to the cluster using endpoint and collect information from `member list`
      - if there are any members in output from etcd
        - take the hostname with highest number and +1
        - if value is greater than value in `guessed`, save value into `guessed` variable
    - read endpoints from kubernetes object:
      - take the hostname of the pod for endpoint with highest number and +1
      - if value is greater than value in `guessed`, save value into `guessed` variable
- read persistent volume claims that fall under StatefulSet label selector
  - if there are any pvcs
    - take the name of the pvc with highest number and +1
    - if value is greater than value in `guessed`, save value into `guessed` variable
- read pods that fall under StatefulSet label selector
  - if there are any pods
    - take the pod name with highest number and +1
    - if value is greater than value in `guessed`, save value into `guessed` variable
- return `guessed`
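Every source in the proposal reduces to the same operation: take a set of StatefulSet-style names, find the highest ordinal suffix, and add one. A minimal sketch of that step follows. The helper names `ordinalOf` and `replicasFromNames` are hypothetical and not taken from the operator code, and the sketch assumes names end in `-<ordinal>` (for example `etcd-2` or `data-etcd-2`), optionally followed by a DNS suffix.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// ordinalOf extracts the trailing ordinal from a StatefulSet-style name.
// Anything after the first "." is ignored, so "etcd-2.etcd-headless" yields 2.
// The boolean is false when the name has no "-<number>" suffix.
func ordinalOf(name string) (int, bool) {
	if dot := strings.Index(name, "."); dot >= 0 {
		name = name[:dot]
	}
	i := strings.LastIndex(name, "-")
	if i < 0 {
		return 0, false
	}
	n, err := strconv.Atoi(name[i+1:])
	if err != nil || n < 0 {
		return 0, false
	}
	return n, true
}

// replicasFromNames returns the highest ordinal found plus one,
// or 0 when none of the names carries an ordinal.
func replicasFromNames(names []string) int {
	replicas := 0
	for _, name := range names {
		if n, ok := ordinalOf(name); ok && n+1 > replicas {
			replicas = n + 1
		}
	}
	return replicas
}

func main() {
	// Pod etcd-1 is missing, but the highest ordinal still implies 3 replicas.
	fmt.Println(replicasFromNames([]string{"etcd-0", "etcd-2"})) // prints 3
}
```

Taking the highest ordinal rather than counting items means that a gap in the sequence (a missing pod or PVC) does not lower the guess.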
I would definitely like to drop these steps altogether.
- Check cluster-state configmap
  - if configmap exists and `initial-cluster-members` defined
    - if there are any hostnames defined in `initial-cluster-members`
      - take the hostname of pod with highest number and +1
      - save value into `guessed` variable
This seems redundant, as we already have this info from checking the Endpoints object:
- read pods that fall under StatefulSet label selector
  - if there are any pods
    - take the pod name with highest number and +1
    - if value is greater than value in `guessed`, save value into `guessed` variable
I don't like this step at all:
- if value is greater than value in `guessed`, save value into `guessed` variable
IMO, if we found a value from a reliable source, such as member list, we should never fall back to a less reliable source, such as "number of endpoints". Only if the more reliable source is unavailable (e.g. we cannot get member list due to lack of quorum), should we try guessing the right number of replicas from Endpoints or PVCs.
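The strict ordering argued for here can be sketched as a list of sources tried from most to least reliable, stopping at the first one that is available. This is only an illustration of the idea: the `source` type and `firstAvailable` function are hypothetical names, and the replica counts are made up for the example. Note that it is stricter than a "keep the maximum" rule, because a lower-ranked source is never consulted once a higher-ranked one has answered.

```go
package main

import "fmt"

// source is a hypothetical lookup. guess reports a replica count and
// whether the source was available at all (e.g. false when etcd has no quorum).
type source struct {
	name  string
	guess func() (int, bool)
}

// firstAvailable walks the sources in order of decreasing reliability and
// returns the answer of the first available one, together with its name.
// It returns 0 and an empty name when no source is available.
func firstAvailable(sources []source) (int, string) {
	for _, s := range sources {
		if n, ok := s.guess(); ok {
			return n, s.name
		}
	}
	return 0, ""
}

func main() {
	sources := []source{
		// Most reliable, but unavailable in this example (no quorum).
		{"member list", func() (int, bool) { return 0, false }},
		{"endpoints", func() (int, bool) { return 3, true }},
		{"pvcs", func() (int, bool) { return 5, true }},
	}
	n, from := firstAvailable(sources)
	fmt.Println(n, from) // prints: 3 endpoints
}
```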
@lllamnyp
I would definitely like to drop these steps:
Check cluster-state configmap
It is created initially and exists from then on. It should always contain correct information unless someone removes it, so why not use it?
read pods that fall under StatefulSet label selector. This seems redundant, as we already have this info from checking the Endpoints object
Do all of our pods always get into the service endpoints? If so, it can be omitted. Also, is there any chance that the service and its endpoints will not exist when this check runs?
If we consider `member list` a reliable source, then you're right, let's return it directly.
v2:
- Create variable `guessed=0`
- Check endpoints for etcd-headless service
  - if there are any endpoints
    - connect to the cluster using endpoint and collect information from `member list`
      - if there are any members in output from etcd
        - take the hostname with highest number and +1
        - return value
    - read endpoints from kubernetes object:
      - take the hostname of the pod for endpoint with highest number and +1
      - if value is greater than value in `guessed`, save value into `guessed` variable
- Check cluster-state configmap
  - if configmap exists and `initial-cluster-members` defined
    - if there are any hostnames defined in `initial-cluster-members`
      - take the hostname of pod with highest number and +1
      - save value into `guessed` variable
- read persistent volume claims that fall under StatefulSet label selector
  - if there are any pvcs
    - take the name of the pvc with highest number and +1
    - if value is greater than value in `guessed`, save value into `guessed` variable
- return `guessed`
The etcd-headless service will always have endpoints: it doesn't rely on readiness probes, so all created pods with IP addresses will be in the headless service. This service is ensured at the very beginning, so it must exist.
I personally do not like checking the cluster-state configmap, because in the past we agreed that it is a kind of cache and that it would be nice to get this info from the etcd PVCs. So the number of PVCs is, in my opinion, a more reliable source than the cluster-state cm. The cm can be checked, but only as a last resort.
Okay it seems cluster-state configmap check makes no sense, so removed:
v3:
- Create variable `guessed=0`
- Check endpoints for etcd-headless service
  - if there are any endpoints
    - connect to the cluster using endpoint and collect information from `member list`
      - if there are any members in output from etcd
        - take the hostname with highest number and +1
        - return value
    - read endpoints from kubernetes object:
      - take the hostname of the pod for endpoint with highest number and +1
      - if value is greater than value in `guessed`, save value into `guessed` variable
- read persistent volume claims that fall under StatefulSet label selector
  - if there are any pvcs
    - take the name of the pvc with highest number and +1
    - if value is greater than value in `guessed`, save value into `guessed` variable
- return `guessed`
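The v3 flow could look roughly like the sketch below. It is not the operator's implementation: `guessReplicas`, `replicasFromNames`, and `errNoQuorum` are hypothetical names, the Kubernetes and etcd lookups are replaced by plain slices and a callback so the example runs on its own, and the names `etcd-N` and `data-etcd-N` are only illustrative.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// errNoQuorum stands in for any failure to obtain the etcd member list.
var errNoQuorum = errors.New("member list unavailable: no quorum")

// replicasFromNames returns the highest trailing ordinal in names plus one,
// or 0 when no name ends in "-<number>".
func replicasFromNames(names []string) int {
	replicas := 0
	for _, name := range names {
		i := strings.LastIndex(name, "-")
		if i < 0 {
			continue
		}
		n, err := strconv.Atoi(name[i+1:])
		if err == nil && n >= 0 && n+1 > replicas {
			replicas = n + 1
		}
	}
	return replicas
}

// guessReplicas follows the v3 steps: if there are endpoints, ask etcd for
// its member list and return that answer directly when it yields an ordinal;
// otherwise keep the larger of the values derived from endpoints and PVCs.
func guessReplicas(endpointHosts []string, listMembers func() ([]string, error), pvcNames []string) int {
	guessed := 0
	if len(endpointHosts) > 0 {
		if members, err := listMembers(); err == nil {
			if n := replicasFromNames(members); n > 0 {
				return n // the member list is treated as authoritative
			}
		}
		if n := replicasFromNames(endpointHosts); n > guessed {
			guessed = n
		}
	}
	if n := replicasFromNames(pvcNames); n > guessed {
		guessed = n
	}
	return guessed
}

func main() {
	endpoints := []string{"etcd-0", "etcd-1"}
	pvcs := []string{"data-etcd-0", "data-etcd-1", "data-etcd-4"}
	healthy := func() ([]string, error) { return []string{"etcd-0", "etcd-1", "etcd-2"}, nil }
	broken := func() ([]string, error) { return nil, errNoQuorum }

	fmt.Println(guessReplicas(endpoints, healthy, pvcs)) // prints 3 (from member list)
	fmt.Println(guessReplicas(endpoints, broken, pvcs))  // prints 5 (max of endpoints and PVCs)
}
```

Passing the member list as a callback keeps the etcd call from being made when there are no endpoints to connect to, which matches the nesting of the steps.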
Okay it seems `cluster-state` configmap check makes no sense, so removed: v3:
- return `guessed`
LGTM
This function is tentatively implemented here as `func (o *observables) desiredReplicas() (max int) {}`