etcd
TestClusterOf1UsingDiscovery fails if the last-release etcd binary is present
Which github workflows are flaking?
no
Which tests are flaking?
TestClusterOf1UsingDiscovery
Github Action link
no
Reason for failure (if possible)
https://github.com/etcd-io/etcd/blob/8de14bd36e846eef7bc0d66924a28e1352bb5a72/tests/e2e/discovery_test.go#L56
When the cluster_proxy build tag is enabled, ProxyV2 never gets a chance to start because we only define it. Our test-grpcproxy-e2e workflow doesn't download the last-release binary, so this case never runs there. The failure only happens in local testing.
Anything else we need to know?
If ProxyV2 is no longer used, I think we should delete it and add the !cluster_proxy tag to the test.
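For reference, here is a minimal sketch of what the !cluster_proxy tag would do, assuming it is written as a `//go:build !cluster_proxy` line at the top of the test file. The helper name `fileIncluded` is hypothetical and not part of etcd. It uses the standard `go/build/constraint` package to evaluate the constraint against a set of active build tags:

```go
package main

import (
	"fmt"
	"go/build/constraint"
)

// fileIncluded reports whether a file carrying the given //go:build line
// would be compiled when the given set of build tags is active.
// Hypothetical helper for illustration only.
func fileIncluded(line string, tags map[string]bool) (bool, error) {
	expr, err := constraint.Parse(line)
	if err != nil {
		return false, err
	}
	// Eval asks the callback whether each tag in the expression is set.
	return expr.Eval(func(tag string) bool { return tags[tag] }), nil
}

func main() {
	const line = "//go:build !cluster_proxy"
	// First case: no extra tags. Second case: go test -tags cluster_proxy.
	for _, tags := range []map[string]bool{{}, {"cluster_proxy": true}} {
		ok, err := fileIncluded(line, tags)
		if err != nil {
			fmt.Println("parse error:", err)
			return
		}
		fmt.Printf("tags=%v included=%v\n", tags, ok)
	}
}
```

With no extra tags the file is included. With `-tags cluster_proxy` it is excluded, so the discovery tests would not be compiled into that run.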
Discussed during sig-etcd triage. Assigned to @ivanvc to initially try to recreate the failure.
/assign @ivanvc
I left this test case running with stress, and after ~10,000 iterations (~3 hours), it didn't fail.
I ran this against v3.5.12. So, I assume the flake is no longer valid with the most recent version.
Should we close this issue? Or do we want to verify with someone else that it's not happening anymore?
3h26m25s: 10136 runs so far, 0 failures
cc. @jmhbnz
Thanks @ivanvc. Question for @fuweid: do you think we can close this one now? Or is there still a flake here that we are missing?
Sorry for the late reply. I found this issue while trying to upgrade the grpc-gateway deps. Sorry for the unclear comment.
Here are the steps to reproduce:
$ git checkout main
$ git clean -dxf
$ make
$ PASSES=release scripts/test.sh # download 3.5.x
$ cd tests
$ go test -v ./e2e -tags cluster_proxy -run TestClusterOf1UsingDiscovery
=== RUN TestClusterOf1UsingDiscovery
before.go:36: Changing working directory to: /tmp/TestClusterOf1UsingDiscovery908449421/001
logger.go:146: 2024-03-30T16:39:35.713+0800 INFO starting server... {"name": "TestClusterOf1UsingDiscovery-test-0"}
logger.go:146: 2024-03-30T16:39:35.713+0800 INFO spawning process {"args": ["/home/fuweid/workspace/etcd/bin/etcd-last-release", "--name=TestClusterOf1UsingDiscovery-test-0", "--listen-client-urls=http://localhost:2000", "--advertise-client-urls=http://localhost:2000", "--listen-peer-urls=http://localhost:2001", "--initial-advertise-peer-urls=http://localhost:2001", "--initial-cluster-token=new", "--data-dir", "/tmp/TestClusterOf1UsingDiscovery908449421/002", "--snapshot-count=10000", "--enable-v2", "--initial-cluster-token=new", "--initial-cluster=TestClusterOf1UsingDiscovery-test-0=http://localhost:2001", "--initial-cluster-state=new"], "working-dir": "/tmp/TestClusterOf1UsingDiscovery908449421/001", "name": "TestClusterOf1UsingDiscovery-test-0", "environment-variables": ["PATH=/usr/local/go/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/usr/lib/wsl/lib:/mnt/c/Program Files (x86)/Microsoft SDKs/Azure/CLI2/wbin:/mnt/c/WINDOWS/system32:/mnt/c/WINDOWS:/mnt/c/WINDOWS/System32/Wbem:/mnt/c/WINDOWS/System32/WindowsPowerShell/v1.0/:/mnt/c/WINDOWS/System32/OpenSSH/:/mnt/c/Program Files/dotnet/:/mnt/c/Program Files/Go/bin:/mnt/c/Program Files/Git/cmd:/mnt/c/Users/weifu/AppData/Local/Microsoft/WindowsApps:/mnt/c/Users/weifu/AppData/Local/Programs/Microsoft VS Code/bin:/mnt/c/Users/weifu/go/bin:/home/fuweid/.fzf/bin:/usr/local/go/bin:/home/fuweid/go/bin:/opt/bin:/opt/fuwei/bin:/usr/local/go/bin:/home/fuweid/go/bin:/opt/bin:/opt/fuwei/bin", "ETCD_UNSUPPORTED_ARCH=amd64", "ETCD_VERIFY=all"]}
logger.go:146: 2024-03-30T16:39:36.458+0800 INFO started server. {"name": "TestClusterOf1UsingDiscovery-test-0", "pid": 15195}
logger.go:146: 2024-03-30T16:39:36.458+0800 INFO spawning process {"args": ["/home/fuweid/workspace/etcd/bin/etcd-last-release", "grpc-proxy", "start", "--listen-addr", "localhost:2003", "--endpoints", "http://localhost:2000", "--advertise-client-url", "", "--data-dir", "/tmp/TestClusterOf1UsingDiscovery908449421/002"], "working-dir": "/tmp/TestClusterOf1UsingDiscovery908449421/001", "name": "TestClusterOf1UsingDiscovery-test-0", "environment-variables": ["PATH=/usr/local/go/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/usr/lib/wsl/lib:/mnt/c/Program Files (x86)/Microsoft SDKs/Azure/CLI2/wbin:/mnt/c/WINDOWS/system32:/mnt/c/WINDOWS:/mnt/c/WINDOWS/System32/Wbem:/mnt/c/WINDOWS/System32/WindowsPowerShell/v1.0/:/mnt/c/WINDOWS/System32/OpenSSH/:/mnt/c/Program Files/dotnet/:/mnt/c/Program Files/Go/bin:/mnt/c/Program Files/Git/cmd:/mnt/c/Users/weifu/AppData/Local/Microsoft/WindowsApps:/mnt/c/Users/weifu/AppData/Local/Programs/Microsoft VS Code/bin:/mnt/c/Users/weifu/go/bin:/home/fuweid/.fzf/bin:/usr/local/go/bin:/home/fuweid/go/bin:/opt/bin:/opt/fuwei/bin:/usr/local/go/bin:/home/fuweid/go/bin:/opt/bin:/opt/fuwei/bin", "ETCD_UNSUPPORTED_ARCH=amd64", "ETCD_VERIFY=all"]}
discovery_test.go:61: dial tcp 127.0.0.1:2002: connect: connection refused
logger.go:146: 2024-03-30T16:39:36.480+0800 INFO closing test cluster...
logger.go:146: 2024-03-30T16:39:36.481+0800 INFO stopping server... {"name": "TestClusterOf1UsingDiscovery-test-0"}
logger.go:146: 2024-03-30T16:39:36.491+0800 INFO stopped server. {"name": "TestClusterOf1UsingDiscovery-test-0"}
logger.go:146: 2024-03-30T16:39:36.491+0800 INFO closing server... {"name": "TestClusterOf1UsingDiscovery-test-0"}
logger.go:146: 2024-03-30T16:39:36.491+0800 INFO removing directory {"data-dir": "/tmp/TestClusterOf1UsingDiscovery908449421/002"}
logger.go:146: 2024-03-30T16:39:36.492+0800 INFO closed test cluster.
--- FAIL: TestClusterOf1UsingDiscovery (0.78s)
FAIL
FAIL go.etcd.io/etcd/tests/v3/e2e 0.795s
FAIL
I think this case doesn't work when the last-release binary is present. We should skip it instead of running it.
When the cluster_proxy build tag is enabled, ProxyV2 never gets a chance to start because we only define it. Our test-grpcproxy-e2e workflow doesn't download the last-release binary, so this case never runs there. The failure only happens in local testing.
@fuweid, thanks for the detailed steps for reproducing. I can confirm that it does fail when running with the cluster_proxy tag.
As hinted by @fuweid, I can also confirm that all the test cases from e2e/discovery_test.go fail when the cluster_proxy tag is set.
$ go test -v ./e2e -tags cluster_proxy -run 'Test(TLS)?ClusterOf[13]UsingDiscovery'
=== RUN TestClusterOf1UsingDiscovery
before.go:36: Changing working directory to: /tmp/TestClusterOf1UsingDiscovery2671669508/001
logger.go:146: 2024-04-01T21:47:01.421-0700 INFO starting server... {"name": "TestClusterOf1UsingDiscovery-test-0"}
logger.go:146: 2024-04-01T21:47:01.421-0700 INFO spawning process {"args": ["/home/ivan/Code/Personal/etcd/etcd/bin/etcd-last-release", "--name=TestClusterOf1UsingDiscovery-test-0", "--listen-client-urls=http://localhost:2000", "--advertise-client-urls=http://localhost:2000", "--listen-peer-urls=http://localhost:2001", "--initial-advertise-peer-urls=http://localhost:2001", "--initial-cluster-token=new", "--data-dir", "/tmp/TestClusterOf1UsingDiscovery2671669508/002", "--snapshot-count=10000", "--enable-v2", "--initial-cluster-token=new", "--initial-cluster=TestClusterOf1UsingDiscovery-test-0=http://localhost:2001", "--initial-cluster-state=new"], "working-dir": "/tmp/TestClusterOf1UsingDiscovery2671669508/001", "name": "TestClusterOf1UsingDiscovery-test-0", "environment-variables": ["PATH=/home/ivan/.local/share/asdf/installs/golang/1.22.1/go/bin:/home/ivan/.local/share/asdf/installs/golang/1.22.1/go/bin:/home/ivan/.local/share/asdf/installs/golang/1.22.1/packages/bin:/home/ivan/.local/share/asdf/installs/golang/1.22.0/packages/bin:/home/ivan/bin:/home/ivan/.local/share/asdf/installs/golang/1.22.0/packages/bin:/home/ivan/bin:/home/ivan/.local/share/asdf/shims:/opt/asdf-vm/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/bin/site_perl:/usr/bin/vendor_perl:/usr/bin/core_perl:/opt/rocm/bin", "ETCD_UNSUPPORTED_ARCH=amd64", "ETCD_VERIFY=all"]}
logger.go:146: 2024-04-01T21:47:02.144-0700 INFO started server. {"name": "TestClusterOf1UsingDiscovery-test-0", "pid": 438584}
logger.go:146: 2024-04-01T21:47:02.145-0700 INFO spawning process {"args": ["/home/ivan/Code/Personal/etcd/etcd/bin/etcd-last-release", "grpc-proxy", "start", "--listen-addr", "localhost:2003", "--endpoints", "http://localhost:2000", "--advertise-client-url", "", "--data-dir", "/tmp/TestClusterOf1UsingDiscovery2671669508/002"], "working-dir": "/tmp/TestClusterOf1UsingDiscovery2671669508/001", "name": "TestClusterOf1UsingDiscovery-test-0", "environment-variables": ["PATH=/home/ivan/.local/share/asdf/installs/golang/1.22.1/go/bin:/home/ivan/.local/share/asdf/installs/golang/1.22.1/go/bin:/home/ivan/.local/share/asdf/installs/golang/1.22.1/packages/bin:/home/ivan/.local/share/asdf/installs/golang/1.22.0/packages/bin:/home/ivan/bin:/home/ivan/.local/share/asdf/installs/golang/1.22.0/packages/bin:/home/ivan/bin:/home/ivan/.local/share/asdf/shims:/opt/asdf-vm/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/bin/site_perl:/usr/bin/vendor_perl:/usr/bin/core_perl:/opt/rocm/bin", "ETCD_UNSUPPORTED_ARCH=amd64", "ETCD_VERIFY=all"]}
discovery_test.go:61: dial tcp [::1]:2002: connect: connection refused
logger.go:146: 2024-04-01T21:47:02.156-0700 INFO closing test cluster...
logger.go:146: 2024-04-01T21:47:02.157-0700 INFO stopping server... {"name": "TestClusterOf1UsingDiscovery-test-0"}
logger.go:146: 2024-04-01T21:47:02.170-0700 INFO stopped server. {"name": "TestClusterOf1UsingDiscovery-test-0"}
logger.go:146: 2024-04-01T21:47:02.170-0700 INFO closing server... {"name": "TestClusterOf1UsingDiscovery-test-0"}
logger.go:146: 2024-04-01T21:47:02.170-0700 INFO removing directory {"data-dir": "/tmp/TestClusterOf1UsingDiscovery2671669508/002"}
logger.go:146: 2024-04-01T21:47:02.176-0700 INFO closed test cluster.
--- FAIL: TestClusterOf1UsingDiscovery (0.76s)
=== RUN TestClusterOf3UsingDiscovery
before.go:36: Changing working directory to: /tmp/TestClusterOf3UsingDiscovery3135499303/001
logger.go:146: 2024-04-01T21:47:02.177-0700 INFO starting server... {"name": "TestClusterOf3UsingDiscovery-test-0"}
logger.go:146: 2024-04-01T21:47:02.177-0700 INFO spawning process {"args": ["/home/ivan/Code/Personal/etcd/etcd/bin/etcd-last-release", "--name=TestClusterOf3UsingDiscovery-test-0", "--listen-client-urls=http://localhost:2000", "--advertise-client-urls=http://localhost:2000", "--listen-peer-urls=http://localhost:2001", "--initial-advertise-peer-urls=http://localhost:2001", "--initial-cluster-token=new", "--data-dir", "/tmp/TestClusterOf3UsingDiscovery3135499303/002", "--snapshot-count=10000", "--enable-v2", "--initial-cluster-token=new", "--initial-cluster=TestClusterOf3UsingDiscovery-test-0=http://localhost:2001", "--initial-cluster-state=new"], "working-dir": "/tmp/TestClusterOf3UsingDiscovery3135499303/001", "name": "TestClusterOf3UsingDiscovery-test-0", "environment-variables": ["PATH=/home/ivan/.local/share/asdf/installs/golang/1.22.1/go/bin:/home/ivan/.local/share/asdf/installs/golang/1.22.1/go/bin:/home/ivan/.local/share/asdf/installs/golang/1.22.1/packages/bin:/home/ivan/.local/share/asdf/installs/golang/1.22.0/packages/bin:/home/ivan/bin:/home/ivan/.local/share/asdf/installs/golang/1.22.0/packages/bin:/home/ivan/bin:/home/ivan/.local/share/asdf/shims:/opt/asdf-vm/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/bin/site_perl:/usr/bin/vendor_perl:/usr/bin/core_perl:/opt/rocm/bin", "ETCD_UNSUPPORTED_ARCH=amd64", "ETCD_VERIFY=all"]}
logger.go:146: 2024-04-01T21:47:03.204-0700 INFO started server. {"name": "TestClusterOf3UsingDiscovery-test-0", "pid": 438642}
logger.go:146: 2024-04-01T21:47:03.204-0700 INFO spawning process {"args": ["/home/ivan/Code/Personal/etcd/etcd/bin/etcd-last-release", "grpc-proxy", "start", "--listen-addr", "localhost:2003", "--endpoints", "http://localhost:2000", "--advertise-client-url", "", "--data-dir", "/tmp/TestClusterOf3UsingDiscovery3135499303/002"], "working-dir": "/tmp/TestClusterOf3UsingDiscovery3135499303/001", "name": "TestClusterOf3UsingDiscovery-test-0", "environment-variables": ["PATH=/home/ivan/.local/share/asdf/installs/golang/1.22.1/go/bin:/home/ivan/.local/share/asdf/installs/golang/1.22.1/go/bin:/home/ivan/.local/share/asdf/installs/golang/1.22.1/packages/bin:/home/ivan/.local/share/asdf/installs/golang/1.22.0/packages/bin:/home/ivan/bin:/home/ivan/.local/share/asdf/installs/golang/1.22.0/packages/bin:/home/ivan/bin:/home/ivan/.local/share/asdf/shims:/opt/asdf-vm/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/bin/site_perl:/usr/bin/vendor_perl:/usr/bin/core_perl:/opt/rocm/bin", "ETCD_UNSUPPORTED_ARCH=amd64", "ETCD_VERIFY=all"]}
discovery_test.go:61: dial tcp [::1]:2002: connect: connection refused
logger.go:146: 2024-04-01T21:47:03.216-0700 INFO closing test cluster...
logger.go:146: 2024-04-01T21:47:03.216-0700 INFO stopping server... {"name": "TestClusterOf3UsingDiscovery-test-0"}
logger.go:146: 2024-04-01T21:47:03.229-0700 INFO stopped server. {"name": "TestClusterOf3UsingDiscovery-test-0"}
logger.go:146: 2024-04-01T21:47:03.229-0700 INFO closing server... {"name": "TestClusterOf3UsingDiscovery-test-0"}
logger.go:146: 2024-04-01T21:47:03.229-0700 INFO removing directory {"data-dir": "/tmp/TestClusterOf3UsingDiscovery3135499303/002"}
logger.go:146: 2024-04-01T21:47:03.235-0700 INFO closed test cluster.
--- FAIL: TestClusterOf3UsingDiscovery (1.06s)
=== RUN TestTLSClusterOf3UsingDiscovery
before.go:36: Changing working directory to: /tmp/TestTLSClusterOf3UsingDiscovery294755773/001
logger.go:146: 2024-04-01T21:47:03.236-0700 INFO starting server... {"name": "TestTLSClusterOf3UsingDiscovery-test-0"}
logger.go:146: 2024-04-01T21:47:03.236-0700 INFO spawning process {"args": ["/home/ivan/Code/Personal/etcd/etcd/bin/etcd-last-release", "--name=TestTLSClusterOf3UsingDiscovery-test-0", "--listen-client-urls=http://localhost:2000", "--advertise-client-urls=http://localhost:2000", "--listen-peer-urls=http://localhost:2001", "--initial-advertise-peer-urls=http://localhost:2001", "--initial-cluster-token=new", "--data-dir", "/tmp/TestTLSClusterOf3UsingDiscovery294755773/002", "--snapshot-count=10000", "--enable-v2", "--initial-cluster-token=new", "--initial-cluster=TestTLSClusterOf3UsingDiscovery-test-0=http://localhost:2001", "--initial-cluster-state=new"], "working-dir": "/tmp/TestTLSClusterOf3UsingDiscovery294755773/001", "name": "TestTLSClusterOf3UsingDiscovery-test-0", "environment-variables": ["PATH=/home/ivan/.local/share/asdf/installs/golang/1.22.1/go/bin:/home/ivan/.local/share/asdf/installs/golang/1.22.1/go/bin:/home/ivan/.local/share/asdf/installs/golang/1.22.1/packages/bin:/home/ivan/.local/share/asdf/installs/golang/1.22.0/packages/bin:/home/ivan/bin:/home/ivan/.local/share/asdf/installs/golang/1.22.0/packages/bin:/home/ivan/bin:/home/ivan/.local/share/asdf/shims:/opt/asdf-vm/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/bin/site_perl:/usr/bin/vendor_perl:/usr/bin/core_perl:/opt/rocm/bin", "ETCD_UNSUPPORTED_ARCH=amd64", "ETCD_VERIFY=all"]}
logger.go:146: 2024-04-01T21:47:04.162-0700 INFO started server. {"name": "TestTLSClusterOf3UsingDiscovery-test-0", "pid": 438699}
logger.go:146: 2024-04-01T21:47:04.162-0700 INFO spawning process {"args": ["/home/ivan/Code/Personal/etcd/etcd/bin/etcd-last-release", "grpc-proxy", "start", "--listen-addr", "localhost:2003", "--endpoints", "http://localhost:2000", "--advertise-client-url", "", "--data-dir", "/tmp/TestTLSClusterOf3UsingDiscovery294755773/002"], "working-dir": "/tmp/TestTLSClusterOf3UsingDiscovery294755773/001", "name": "TestTLSClusterOf3UsingDiscovery-test-0", "environment-variables": ["PATH=/home/ivan/.local/share/asdf/installs/golang/1.22.1/go/bin:/home/ivan/.local/share/asdf/installs/golang/1.22.1/go/bin:/home/ivan/.local/share/asdf/installs/golang/1.22.1/packages/bin:/home/ivan/.local/share/asdf/installs/golang/1.22.0/packages/bin:/home/ivan/bin:/home/ivan/.local/share/asdf/installs/golang/1.22.0/packages/bin:/home/ivan/bin:/home/ivan/.local/share/asdf/shims:/opt/asdf-vm/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/bin/site_perl:/usr/bin/vendor_perl:/usr/bin/core_perl:/opt/rocm/bin", "ETCD_UNSUPPORTED_ARCH=amd64", "ETCD_VERIFY=all"]}
discovery_test.go:61: dial tcp [::1]:2002: connect: connection refused
logger.go:146: 2024-04-01T21:47:04.173-0700 INFO closing test cluster...
logger.go:146: 2024-04-01T21:47:04.173-0700 INFO stopping server... {"name": "TestTLSClusterOf3UsingDiscovery-test-0"}
logger.go:146: 2024-04-01T21:47:04.185-0700 INFO stopped server. {"name": "TestTLSClusterOf3UsingDiscovery-test-0"}
logger.go:146: 2024-04-01T21:47:04.185-0700 INFO closing server... {"name": "TestTLSClusterOf3UsingDiscovery-test-0"}
logger.go:146: 2024-04-01T21:47:04.185-0700 INFO removing directory {"data-dir": "/tmp/TestTLSClusterOf3UsingDiscovery294755773/002"}
logger.go:146: 2024-04-01T21:47:04.192-0700 INFO closed test cluster.
--- FAIL: TestTLSClusterOf3UsingDiscovery (0.96s)
FAIL
FAIL go.etcd.io/etcd/tests/v3/e2e 2.778s
FAIL
@jmhbnz, are we okay with ignoring e2e/discovery_test.go when the cluster_proxy tag is set? Do we want someone else to weigh in (Benjamin or Marek)?
I think we can just add the !cluster_proxy tag to disable it.
Agreed. Please feel free to raise a PR if you have time, @ivanvc.