spilo
Helm chart - use external etcd
Is there any way to make the Patroni chart use an external etcd cluster, e.g. one created by etcd-operator, instead of the built-in etcd?
I tried pointing the chart at the client Service of the etcd-operator cluster with Etcd.Host=etcd-cluster-client, but that did not work: the chart's own etcd was still created.
I also tested the resilience of the chart's etcd, and it is not good. If an etcd pod gets restarted or moved to another node, it does not come up again:
kubectl logs patroni1-etcd-2
cat: can't open '/var/run/etcd/member_id': No such file or directory
Re-joining etcd member
I think you need to set the value of Etcd.Host to the first Pod of the etcd cluster created by etcd-operator:
https://github.com/coreos/etcd-operator#create-and-destroy-an-etcd-cluster
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
example-etcd-cluster-0000 1/1 Running 0 1m
example-etcd-cluster-0001 1/1 Running 0 1m
example-etcd-cluster-0002 1/1 Running 0 1m
In this example it would be example-etcd-cluster-0000. Patroni will use it to discover all other nodes of the etcd cluster.
But if that first etcd pod gets destroyed, etcd-operator creates a new pod with a new name. That is not really an HA setup; a Service (svc) would be better there.
But if that first etcd pod gets destroyed, etcd-operator creates a new pod with a new name.
Will it? I thought it would preserve the original name and mimic, so to say, StatefulSet behaviour.
a Service (svc) would be better there
That could also work. You can create a Kubernetes Service with a label selector that matches all Pods of the etcd cluster and specify that Service in Etcd.Host.
In the end, Patroni will use such a Service only once, to get the topology of the etcd cluster; after that it connects to every node individually.
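A minimal sketch of such a Service, written as a shell snippet that generates the manifest. The Service name `etcd-for-patroni` and the file name are hypothetical, and the selector labels `app=etcd` and `etcd_cluster=etcd-cluster` are assumptions that must match the labels on your etcd Pods (check with `kubectl get pods --show-labels`). If etcd-operator already created a client Service for the cluster, that one can be used instead.

```shell
# Write a Service manifest that selects all etcd Pods by label.
# NOTE: the name, file name and label values are illustrative assumptions.
cat > etcd-for-patroni-svc.yaml <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: etcd-for-patroni
spec:
  selector:
    app: etcd
    etcd_cluster: etcd-cluster
  ports:
    - name: client
      port: 2379
      targetPort: 2379
EOF

# Apply it in the namespace where the etcd Pods run, then point the chart at it:
#   kubectl apply -f etcd-for-patroni-svc.yaml
#   Etcd.Host=etcd-for-patroni
```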
no, it does not mimic StatefulSet behaviour
I already tried the Service of the etcd-operator cluster, Etcd.Host=etcd-cluster-client; that did not work, the chart's own etcd was still created.
That is not a good approach with etcd-operator:
In the end, Patroni will use such a Service only once, to get the topology of the etcd cluster; after that it connects to every node individually.
because etcd-operator always recreates a failed pod under a new name
because etcd-operator always recreates a failed pod under a new name
Patroni is much smarter than you think. If the Pod it is connected to fails, it switches to another Pod and rediscovers the topology of the etcd cluster. If nothing fails, it refreshes the topology every 5 minutes. If all Pods fail at the same time, Patroni goes back to the original ETCD_HOST specified in the configuration. If that points to the Service, everything will be fine. Basically, you can rotate all etcd Pods and Patroni will survive.
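The fallback order described here can be sketched roughly as follows. This only illustrates the described behaviour and is not Patroni's actual code: `probe` and `pick_endpoint` are hypothetical helpers, and probing etcd's `/version` endpoint with curl is an assumption made for the sketch.

```shell
# probe URL: succeeds if an etcd endpoint answers on its client port.
probe() {
  curl -fsS --max-time 2 "$1/version" >/dev/null 2>&1
}

# pick_endpoint URL...: print the first reachable URL.
# Pass the last known members first and the configured ETCD_HOST
# (ideally a Service) last, so that it acts as the fallback.
pick_endpoint() {
  for url in "$@"; do
    if probe "$url"; then
      printf '%s\n' "$url"
      return 0
    fi
  done
  return 1
}

# Example call from inside a pod (host names are placeholders):
#   pick_endpoint http://etcd-cluster-0000.etcd-cluster:2379 \
#                 http://etcd-cluster-client:2379
```

Because the Service name stays stable while individual pod names change, keeping it as the last candidate is what lets a client recover after every etcd pod has been replaced.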
OK, cool. But then why did it not connect to the Service of the etcd-operator cluster?
Does etcd-operator create a Service?
Yup, as you can see below, it is etcd-cluster-client.
$ k get service
NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
etcd-cluster None <none> 2379/TCP,2380/TCP 2h
etcd-cluster-client 10.3.0.149 <none> 2379/TCP 2h
$ k describe svc etcd-cluster-client
Name: etcd-cluster-client
Namespace: spcqm-system
Labels: app=etcd
etcd_cluster=etcd-cluster
Annotations: service.alpha.kubernetes.io/tolerate-unready-endpoints=true
Selector: app=etcd,etcd_cluster=etcd-cluster
Type: ClusterIP
IP: 10.3.0.149
Port: client 2379/TCP
Endpoints: 10.2.2.11:2379,10.2.3.16:2379,10.2.4.13:2379
Session Affinity: None
Events: <none>
$ k get pods -l app=etcd -o wide
NAME READY STATUS RESTARTS AGE IP NODE
etcd-cluster-0000 1/1 Running 0 2h 10.2.4.13 xxx
etcd-cluster-0001 1/1 Running 0 2h 10.2.3.16 xxx
etcd-cluster-0002 1/1 Running 0 2h 10.2.2.11 xxx
And what does curl http://etcd-cluster-client:2379/v2/machines show?
You need to execute it from one of the pods (for example, one of the Patroni pods).
root@patroni3-patroni-0:/home/postgres# curl http://etcd-cluster-client:2379/v2/machines
http://etcd-cluster-0000.etcd-cluster.spcqm-system.svc:2379, http://etcd-cluster-0001.etcd-cluster.spcqm-system.svc:2379, http://etcd-cluster-0002.etcd-cluster.spcqm-system.svc:2379
looks good there
Looks good.
Is http://etcd-cluster-0000.etcd-cluster.spcqm-system.svc:2379 accessible from the Patroni pod?
And what does echo $ETCD_HOST show?
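These checks can be combined in a small script run from inside a pod. This is a sketch: the sample `machines` value is copied from the output above so that the snippet runs anywhere, and using etcd's `/health` endpoint for the per-member check is an assumption about what you want to verify.

```shell
# In the cluster, fetch the list instead of hard-coding it:
#   machines=$(curl -s "http://${ETCD_HOST}:2379/v2/machines")
machines='http://etcd-cluster-0000.etcd-cluster.spcqm-system.svc:2379, http://etcd-cluster-0001.etcd-cluster.spcqm-system.svc:2379, http://etcd-cluster-0002.etcd-cluster.spcqm-system.svc:2379'

# Split the comma-separated list into one URL per line.
members=$(printf '%s\n' "$machines" | tr ',' '\n' | sed 's/^[[:space:]]*//')
member_count=$(printf '%s\n' "$members" | grep -c .)
echo "etcd reports $member_count members"

# Print one health-check command per member; run them from a Patroni pod.
for url in $members; do
  echo "curl -s --max-time 3 $url/health"
done
```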
root@patroni3-patroni-0:/home/postgres# env | grep ETCD_HOST
ETCD_HOST=etcd-cluster-client
root@patroni3-patroni-0:/home/postgres# curl http://etcd-cluster-0000.etcd-cluster.spcqm-system.svc:2379
404 page not found
etcd-operator is installed in the same namespace as Patroni.
The DNS check for the etcd pod is fine:
kubectl exec busybox -- nslookup etcd-cluster-0000.etcd-cluster.spcqm-system.svc
Server: 10.3.0.10
Address 1: 10.3.0.10 kube-dns.kube-system.svc.cluster.local
Name: etcd-cluster-0000.etcd-cluster.spcqm-system.svc
Address 1: 10.2.4.13 etcd-cluster-0000.etcd-cluster.spcqm-system.svc.cluster.local
Everything looks good. Patroni is configured to use etcd cluster deployed by etcd operator.
Now I am completely lost and don't understand what your problem is.
It is more of a Patroni-related issue; etcd-operator is functioning fine, and I do not have RBAC enabled there.
It is more of a Patroni-related issue
Not really a Patroni issue, but a Patroni Helm chart issue. I am not really familiar with the Helm chart internals, but it seems the Patroni chart has etcd as a dependency: https://github.com/kubernetes/charts/blob/master/incubator/patroni/requirements.yaml
The chart's README says that etcd_host is not used.
I will try removing that dependency tomorrow, but if that env var is not used by Patroni, Patroni should fail.
The chart's README says that etcd_host is not used.
Looking at the chart internals (https://github.com/kubernetes/charts/blob/master/incubator/patroni/templates/statefulset-patroni.yaml#L49), I can tell that it is definitely used and propagated to the StatefulSet and the underlying Pods. The README is just wrong. Sorry about that; I am not a maintainer of the Patroni Helm chart. You can create a pull request updating the Helm chart documentation.
P.S. I am working on a Kubernetes-native deployment of Patroni: https://github.com/zalando/patroni/pull/500. It makes it possible to deploy Patroni on Kubernetes without etcd. If you have time, please try it.
Sure, I will play with the chart tomorrow and will check that too.
(Email reply to Alexander Kukushkin's comment of Wed, 4 Oct 2017 at 20:31.)
@CyberDem0n are the Patroni Kubernetes-native deployment and https://github.com/zalando-incubator/postgres-operator the same thing?
No, postgres-operator is a tool similar to the etcd-operator.
interesting, you guys have two new projects to run postgres in kube
now I'm not sure which one to stick to
Actually not two, but three.
Patroni - does all the heavy lifting, like automatic failover and so on. It can work on bare metal and inside Docker.
Spilo - a Docker package of Patroni+PostgreSQL+wal-e+some other useful stuff.
postgres-operator - deploys Spilo on Kubernetes using third party resources.