Peer RTT unreasonably large

Open freedge opened this issue 10 months ago • 0 comments

Bug report criteria

  • [X] This bug report is not security related, security issues should be disclosed privately via etcd maintainers.
  • [X] This is not a support request or question, support requests or questions should be raised in the etcd discussion forums.
  • [X] You have read the etcd bug reporting guidelines.
  • [X] Existing open issues along with etcd frequently asked questions have been checked and this is not a duplicate.

What happened?

This is a reopening of https://github.com/etcd-io/etcd/issues/11100

The ROUND_TRIPPER_RAFT_MESSAGE probing opens a new connection at each probe, so the measured time includes connection setup and is not really the RTT.

What did you expect to happen?

The ROUND_TRIPPER_RAFT_MESSAGE probing should happen on an existing connection, and the etcd_network_peer_round_trip_time_seconds metric should reflect the actual RTT.

As per https://etcd.io/docs/v3.5/op-guide/performance/:

> The RTT within a datacenter may be as long as several hundred microseconds.

This is not what etcd_network_peer_round_trip_time_seconds reports.

How can we reproduce it (as minimally and precisely as possible)?

Run a cluster, then capture SYN packets on the peer port (2380). New connections show up at each probe:

tcpdump port 2380 and 'tcp[tcpflags] & tcp-syn == tcp-syn'

Anything else we need to know?

No response

Etcd version (please run commands below)

$ etcd --version
etcd Version: 3.6.0-alpha.0
Git SHA: 2674f94c
Go Version: go1.22.1 (Red Hat 1.22.1-1.el9)
Go OS/Arch: linux/amd64

$ etcdctl version
etcdctl version: 3.6.0-alpha.0
API version: 3.6

Etcd configuration (command line flags or environment variables)

Locally:

```
etcd --name infra0 --initial-advertise-peer-urls http://127.0.0.10:2380 \
  --listen-peer-urls http://127.0.0.10:2380 \
  --listen-client-urls http://127.0.0.10:2379,http://127.0.0.1:2379 \
  --advertise-client-urls http://127.0.0.10:2379 \
  --initial-cluster-token etcd-cluster-1 \
  --initial-cluster infra0=http://127.0.0.10:2380,infra1=http://127.0.0.11:2380,infra2=http://127.0.0.12:2380 \
  --initial-cluster-state new \
  --log-level debug --log-outputs stdout
```

Also reproduced in OpenShift 4.14.

Etcd debug information (please run commands below, feel free to obfuscate the IP address or FQDN in the output)

$ etcdctl member list -w table
# paste output here

$ etcdctl --endpoints=<member list> endpoint status -w table
# paste output here

Relevant log output

No response

freedge, Apr 22 '24 07:04