etcd
Peer RTT unreasonably large
Bug report criteria
- [X] This bug report is not security related, security issues should be disclosed privately via etcd maintainers.
- [X] This is not a support request or question, support requests or questions should be raised in the etcd discussion forums.
- [X] You have read the etcd bug reporting guidelines.
- [X] Existing open issues along with etcd frequently asked questions have been checked and this is not a duplicate.
What happened?
This is a reopening of https://github.com/etcd-io/etcd/issues/11100.
The ROUND_TRIPPER_RAFT_MESSAGE probing opens a new connection for each probe, so the measured time includes connection setup and does not reflect the actual RTT.
What did you expect to happen?
The ROUND_TRIPPER_RAFT_MESSAGE probing should run over an existing connection, so that the etcd_network_peer_round_trip_time_seconds metric reflects the actual RTT.
According to https://etcd.io/docs/v3.5/op-guide/performance/:
"The RTT within a datacenter may be as long as several hundred microseconds."
The values reported by etcd_network_peer_round_trip_time_seconds do not match this.
How can we reproduce it (as minimally and precisely as possible)?
Run a cluster.
Capture SYN packets on the peer port and observe that new connections keep being opened:
tcpdump port 2380 and 'tcp[tcpflags] & tcp-syn == tcp-syn'
Anything else we need to know?
No response
Etcd version (please run commands below)
$ etcd --version
etcd Version: 3.6.0-alpha.0
Git SHA: 2674f94c
Go Version: go1.22.1 (Red Hat 1.22.1-1.el9)
Go OS/Arch: linux/amd64
$ etcdctl version
etcdctl version: 3.6.0-alpha.0
API version: 3.6
Etcd configuration (command line flags or environment variables)
Local setup:
```
etcd --name infra0 --initial-advertise-peer-urls http://127.0.0.10:2380 \
--listen-peer-urls http://127.0.0.10:2380 \
--listen-client-urls http://127.0.0.10:2379,http://127.0.0.1:2379 \
--advertise-client-urls http://127.0.0.10:2379 \
--initial-cluster-token etcd-cluster-1 \
--initial-cluster infra0=http://127.0.0.10:2380,infra1=http://127.0.0.11:2380,infra2=http://127.0.0.12:2380 \
--initial-cluster-state new \
--log-level debug --log-outputs stdout
```
Also reproduced on OpenShift 4.14.
Etcd debug information (please run commands below, feel free to obfuscate the IP address or FQDN in the output)
$ etcdctl member list -w table
# paste output here
$ etcdctl --endpoints=<member list> endpoint status -w table
# paste output here
Relevant log output
No response