etcd3
Election example from doc elects 2 leaders after etcd restart, starts multiple workers, or fails to elect a leader
Gist of the documented example, with seemingly minor, benign changes so that it executes.
Versions
- Ubuntu 20.04.3 LTS
- node v16.13.1
- Etcd3 1.1.0
- etcd 3.5.1
- docker image bitnami/etcd tag:latest imageId:57a06bf0a041
- docker 20.10.12
How to reproduce
- run etcd with 'docker pull/run' (see below)
- start 2 instances of the example node.js code and wait for the election; each instance has a unique UUID, and one is elected and does work
- run 'docker ps' to get the CONTAINER ID
- run 'docker stop CONTAINER ID', then wait for the revoke event and for work to stop
- depending on how long etcd is down, the symptoms vary (see below)
- run 'docker start CONTAINER ID' and wait for the election; both instances get elected and do work
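The stop/wait/start part of the steps above can be scripted so the outage length is repeatable. This is only a minimal sketch: the function name etcd_outage and its two arguments are hypothetical helpers, not part of the original report, and it assumes the etcd container was already started with the docker run command in this report.

```shell
#!/bin/sh
# Hypothetical helper (not from the original report):
# stop the etcd container, keep it down for a while, then start it again.
# Usage: etcd_outage CONTAINER_ID SECONDS
etcd_outage() {
  container_id="$1"   # CONTAINER ID as shown by 'docker ps'
  down_seconds="$2"   # how long etcd stays down, e.g. 5, 30 or 180

  # Stop etcd; give up if the container cannot be stopped.
  docker stop "$container_id" || return 1

  # Simulated outage window.
  sleep "$down_seconds"

  # Bring etcd back so the node.js instances can campaign again.
  docker start "$container_id"
}
```

For example, 'etcd_outage CONTAINER_ID 30' approximates the medium outage described under Symptoms; watch the output of both node.js instances while it runs.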
Symptoms
After a brief etcd outage (< 10 seconds), with etcd restarted after EtcdLeaseInvalidError appears:
- only one leader is elected, but multiple workers are doing work on that leader - what?
After a medium etcd outage (~30 seconds), with etcd restarted after EtcdLeaseInvalidError and GRPCUnavailableError appear:
- both node.js processes are leaders and multiple workers are doing work
After a long etcd outage (a few minutes), with etcd restarted after EtcdLeaseInvalidError, GRPCUnavailableError, BrokenCircuitError and GRPCResourceExhastedError appear:
- both node.js processes spew errors endlessly, tripping the circuit breaker, and never recover
docker pull/run
docker run -d --publish 2379:2379 --publish 2380:2380 --env ALLOW_NONE_AUTHENTICATION=yes --env ETCD_ADVERTISE_CLIENT_URLS=http://etcd-server:2379 bitnami/etcd:latest