TestClusterOf1UsingDiscovery fails if the last release binary of etcd is present

Open fuweid opened this issue 2 years ago • 1 comments

Which github workflows are flaking?

no

Which tests are flaking?

TestClusterOf1UsingDiscovery

Github Action link

no

Reason for failure (if possible)

https://github.com/etcd-io/etcd/blob/8de14bd36e846eef7bc0d66924a28e1352bb5a72/tests/e2e/discovery_test.go#L56

When we enable the cluster_proxy build tag, the ProxyV2 never gets a chance to start because we only define it. Our test-grpcproxy-e2e workflow doesn't download the last release binary, so this case never runs there. The failure only happens in local testing.

Anything else we need to know?

If the ProxyV2 is no longer used, I think we should delete it and add the !cluster_proxy tag to the test.
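
To illustrate what the proposed tag would do, here is a minimal sketch that evaluates a `//go:build !cluster_proxy` line with the standard `go/build/constraint` package. The helper name `included` is hypothetical and is not part of etcd; the sketch only assumes the constraint would be placed at the top of the test file.

```go
package main

import (
	"fmt"
	"go/build/constraint"
)

// included reports whether a file carrying the given //go:build line
// would be compiled when the listed build tags are set.
// (Hypothetical helper for illustration only.)
func included(line string, tags ...string) (bool, error) {
	expr, err := constraint.Parse(line)
	if err != nil {
		return false, err
	}
	set := make(map[string]bool, len(tags))
	for _, t := range tags {
		set[t] = true
	}
	// Eval asks the callback whether each tag named in the expression is set.
	return expr.Eval(func(tag string) bool { return set[tag] }), nil
}

func main() {
	const line = "//go:build !cluster_proxy"

	plain, _ := included(line)
	proxy, _ := included(line, "cluster_proxy")

	fmt.Println("no tags:", plain)             // file is compiled
	fmt.Println("-tags cluster_proxy:", proxy) // file is skipped
}
```

With such a constraint, running `go test -tags cluster_proxy` would leave the file out of the build, so its tests would not run at all rather than fail.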

fuweid avatar Jun 16 '23 15:06 fuweid

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 21 days if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Sep 17 '23 23:09 stale[bot]

Discussed during sig-etcd triage. Assigned to @ivanvc to initially try to recreate the failure.

/assign @ivanvc

jmhbnz avatar Mar 28 '24 18:03 jmhbnz

I left this test case running with stress, and after ~10,000 iterations (~3 hours), it didn't fail. I ran this against v3.5.12. So, I assume the flake is no longer valid with the most recent version. Should we close this issue? Or do we want to verify with someone else that it's not happening anymore?

3h26m25s: 10136 runs so far, 0 failures

cc. @jmhbnz

ivanvc avatar Mar 28 '24 22:03 ivanvc

Thanks @ivanvc - Question for @fuweid do you think we can close this one now? Or is there still a flake here we are missing?

jmhbnz avatar Mar 29 '24 17:03 jmhbnz

Sorry for the late reply. I found this issue while trying to upgrade the grpc-gateway deps. Sorry for the unclear comment.

The steps to reproduce are:

$ git checkout main
$ git clean -dxf
$ make
$ PASSES=release scripts/test.sh # download 3.5.x
$ cd tests
$ go test -v ./e2e -tags cluster_proxy -run TestClusterOf1UsingDiscovery
=== RUN   TestClusterOf1UsingDiscovery
    before.go:36: Changing working directory to: /tmp/TestClusterOf1UsingDiscovery908449421/001
    logger.go:146: 2024-03-30T16:39:35.713+0800 INFO    starting server...      {"name": "TestClusterOf1UsingDiscovery-test-0"}
    logger.go:146: 2024-03-30T16:39:35.713+0800 INFO    spawning process        {"args": ["/home/fuweid/workspace/etcd/bin/etcd-last-release", "--name=TestClusterOf1UsingDiscovery-test-0", "--listen-client-urls=http://localhost:2000", "--advertise-client-urls=http://localhost:2000", "--listen-peer-urls=http://localhost:2001", "--initial-advertise-peer-urls=http://localhost:2001", "--initial-cluster-token=new", "--data-dir", "/tmp/TestClusterOf1UsingDiscovery908449421/002", "--snapshot-count=10000", "--enable-v2", "--initial-cluster-token=new", "--initial-cluster=TestClusterOf1UsingDiscovery-test-0=http://localhost:2001", "--initial-cluster-state=new"], "working-dir": "/tmp/TestClusterOf1UsingDiscovery908449421/001", "name": "TestClusterOf1UsingDiscovery-test-0", "environment-variables": ["PATH=/usr/local/go/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/usr/lib/wsl/lib:/mnt/c/Program Files (x86)/Microsoft SDKs/Azure/CLI2/wbin:/mnt/c/WINDOWS/system32:/mnt/c/WINDOWS:/mnt/c/WINDOWS/System32/Wbem:/mnt/c/WINDOWS/System32/WindowsPowerShell/v1.0/:/mnt/c/WINDOWS/System32/OpenSSH/:/mnt/c/Program Files/dotnet/:/mnt/c/Program Files/Go/bin:/mnt/c/Program Files/Git/cmd:/mnt/c/Users/weifu/AppData/Local/Microsoft/WindowsApps:/mnt/c/Users/weifu/AppData/Local/Programs/Microsoft VS Code/bin:/mnt/c/Users/weifu/go/bin:/home/fuweid/.fzf/bin:/usr/local/go/bin:/home/fuweid/go/bin:/opt/bin:/opt/fuwei/bin:/usr/local/go/bin:/home/fuweid/go/bin:/opt/bin:/opt/fuwei/bin", "ETCD_UNSUPPORTED_ARCH=amd64", "ETCD_VERIFY=all"]}
    logger.go:146: 2024-03-30T16:39:36.458+0800 INFO    started server. {"name": "TestClusterOf1UsingDiscovery-test-0", "pid": 15195}
    logger.go:146: 2024-03-30T16:39:36.458+0800 INFO    spawning process        {"args": ["/home/fuweid/workspace/etcd/bin/etcd-last-release", "grpc-proxy", "start", "--listen-addr", "localhost:2003", "--endpoints", "http://localhost:2000", "--advertise-client-url", "", "--data-dir", "/tmp/TestClusterOf1UsingDiscovery908449421/002"], "working-dir": "/tmp/TestClusterOf1UsingDiscovery908449421/001", "name": "TestClusterOf1UsingDiscovery-test-0", "environment-variables": ["PATH=/usr/local/go/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/usr/lib/wsl/lib:/mnt/c/Program Files (x86)/Microsoft SDKs/Azure/CLI2/wbin:/mnt/c/WINDOWS/system32:/mnt/c/WINDOWS:/mnt/c/WINDOWS/System32/Wbem:/mnt/c/WINDOWS/System32/WindowsPowerShell/v1.0/:/mnt/c/WINDOWS/System32/OpenSSH/:/mnt/c/Program Files/dotnet/:/mnt/c/Program Files/Go/bin:/mnt/c/Program Files/Git/cmd:/mnt/c/Users/weifu/AppData/Local/Microsoft/WindowsApps:/mnt/c/Users/weifu/AppData/Local/Programs/Microsoft VS Code/bin:/mnt/c/Users/weifu/go/bin:/home/fuweid/.fzf/bin:/usr/local/go/bin:/home/fuweid/go/bin:/opt/bin:/opt/fuwei/bin:/usr/local/go/bin:/home/fuweid/go/bin:/opt/bin:/opt/fuwei/bin", "ETCD_UNSUPPORTED_ARCH=amd64", "ETCD_VERIFY=all"]}
    discovery_test.go:61: dial tcp 127.0.0.1:2002: connect: connection refused
    logger.go:146: 2024-03-30T16:39:36.480+0800 INFO    closing test cluster...
    logger.go:146: 2024-03-30T16:39:36.481+0800 INFO    stopping server...      {"name": "TestClusterOf1UsingDiscovery-test-0"}
    logger.go:146: 2024-03-30T16:39:36.491+0800 INFO    stopped server. {"name": "TestClusterOf1UsingDiscovery-test-0"}
    logger.go:146: 2024-03-30T16:39:36.491+0800 INFO    closing server...       {"name": "TestClusterOf1UsingDiscovery-test-0"}
    logger.go:146: 2024-03-30T16:39:36.491+0800 INFO    removing directory      {"data-dir": "/tmp/TestClusterOf1UsingDiscovery908449421/002"}
    logger.go:146: 2024-03-30T16:39:36.492+0800 INFO    closed test cluster.
--- FAIL: TestClusterOf1UsingDiscovery (0.78s)
FAIL
FAIL    go.etcd.io/etcd/tests/v3/e2e    0.795s
FAIL

I think this case doesn't work when the last release binary is present. We should skip it instead of running it.

When we enable the cluster_proxy build tag, the ProxyV2 never gets a chance to start because we only define it. Our test-grpcproxy-e2e workflow doesn't download the last release binary, so this case never runs there. The failure only happens in local testing.

fuweid avatar Mar 30 '24 08:03 fuweid

@fuweid, thanks for the detailed steps for reproducing. I can confirm that it does fail when running with the cluster_proxy tag.

As hinted by @fuweid, I can also confirm that all the test cases from e2e/discovery_test.go fail when the cluster_proxy tag is set.

go test -v ./e2e -tags cluster_proxy -run 'Test(TLS)?ClusterOf[13]UsingDiscovery'
=== RUN   TestClusterOf1UsingDiscovery
    before.go:36: Changing working directory to: /tmp/TestClusterOf1UsingDiscovery2671669508/001
    logger.go:146: 2024-04-01T21:47:01.421-0700 INFO    starting server...      {"name": "TestClusterOf1UsingDiscovery-test-0"}
    logger.go:146: 2024-04-01T21:47:01.421-0700 INFO    spawning process        {"args": ["/home/ivan/Code/Personal/etcd/etcd/bin/etcd-last-release", "--name=TestClusterOf1UsingDiscovery-test-0", "--listen-client-urls=http://localhost:2000", "--advertise-client-urls=http://localhost:2000", "--listen-peer-urls=http://localhost:2001", "--initial-advertise-peer-urls=http://localhost:2001", "--initial-cluster-token=new", "--data-dir", "/tmp/TestClusterOf1UsingDiscovery2671669508/002", "--snapshot-count=10000", "--enable-v2", "--initial-cluster-token=new", "--initial-cluster=TestClusterOf1UsingDiscovery-test-0=http://localhost:2001", "--initial-cluster-state=new"], "working-dir": "/tmp/TestClusterOf1UsingDiscovery2671669508/001", "name": "TestClusterOf1UsingDiscovery-test-0", "environment-variables": ["PATH=/home/ivan/.local/share/asdf/installs/golang/1.22.1/go/bin:/home/ivan/.local/share/asdf/installs/golang/1.22.1/go/bin:/home/ivan/.local/share/asdf/installs/golang/1.22.1/packages/bin:/home/ivan/.local/share/asdf/installs/golang/1.22.0/packages/bin:/home/ivan/bin:/home/ivan/.local/share/asdf/installs/golang/1.22.0/packages/bin:/home/ivan/bin:/home/ivan/.local/share/asdf/shims:/opt/asdf-vm/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/bin/site_perl:/usr/bin/vendor_perl:/usr/bin/core_perl:/opt/rocm/bin", "ETCD_UNSUPPORTED_ARCH=amd64", "ETCD_VERIFY=all"]}
    logger.go:146: 2024-04-01T21:47:02.144-0700 INFO    started server. {"name": "TestClusterOf1UsingDiscovery-test-0", "pid": 438584}
    logger.go:146: 2024-04-01T21:47:02.145-0700 INFO    spawning process        {"args": ["/home/ivan/Code/Personal/etcd/etcd/bin/etcd-last-release", "grpc-proxy", "start", "--listen-addr", "localhost:2003", "--endpoints", "http://localhost:2000", "--advertise-client-url", "", "--data-dir", "/tmp/TestClusterOf1UsingDiscovery2671669508/002"], "working-dir": "/tmp/TestClusterOf1UsingDiscovery2671669508/001", "name": "TestClusterOf1UsingDiscovery-test-0", "environment-variables": ["PATH=/home/ivan/.local/share/asdf/installs/golang/1.22.1/go/bin:/home/ivan/.local/share/asdf/installs/golang/1.22.1/go/bin:/home/ivan/.local/share/asdf/installs/golang/1.22.1/packages/bin:/home/ivan/.local/share/asdf/installs/golang/1.22.0/packages/bin:/home/ivan/bin:/home/ivan/.local/share/asdf/installs/golang/1.22.0/packages/bin:/home/ivan/bin:/home/ivan/.local/share/asdf/shims:/opt/asdf-vm/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/bin/site_perl:/usr/bin/vendor_perl:/usr/bin/core_perl:/opt/rocm/bin", "ETCD_UNSUPPORTED_ARCH=amd64", "ETCD_VERIFY=all"]}
    discovery_test.go:61: dial tcp [::1]:2002: connect: connection refused
    logger.go:146: 2024-04-01T21:47:02.156-0700 INFO    closing test cluster...
    logger.go:146: 2024-04-01T21:47:02.157-0700 INFO    stopping server...      {"name": "TestClusterOf1UsingDiscovery-test-0"}
    logger.go:146: 2024-04-01T21:47:02.170-0700 INFO    stopped server. {"name": "TestClusterOf1UsingDiscovery-test-0"}
    logger.go:146: 2024-04-01T21:47:02.170-0700 INFO    closing server...       {"name": "TestClusterOf1UsingDiscovery-test-0"}
    logger.go:146: 2024-04-01T21:47:02.170-0700 INFO    removing directory      {"data-dir": "/tmp/TestClusterOf1UsingDiscovery2671669508/002"}
    logger.go:146: 2024-04-01T21:47:02.176-0700 INFO    closed test cluster.
--- FAIL: TestClusterOf1UsingDiscovery (0.76s)
=== RUN   TestClusterOf3UsingDiscovery
    before.go:36: Changing working directory to: /tmp/TestClusterOf3UsingDiscovery3135499303/001
    logger.go:146: 2024-04-01T21:47:02.177-0700 INFO    starting server...      {"name": "TestClusterOf3UsingDiscovery-test-0"}
    logger.go:146: 2024-04-01T21:47:02.177-0700 INFO    spawning process        {"args": ["/home/ivan/Code/Personal/etcd/etcd/bin/etcd-last-release", "--name=TestClusterOf3UsingDiscovery-test-0", "--listen-client-urls=http://localhost:2000", "--advertise-client-urls=http://localhost:2000", "--listen-peer-urls=http://localhost:2001", "--initial-advertise-peer-urls=http://localhost:2001", "--initial-cluster-token=new", "--data-dir", "/tmp/TestClusterOf3UsingDiscovery3135499303/002", "--snapshot-count=10000", "--enable-v2", "--initial-cluster-token=new", "--initial-cluster=TestClusterOf3UsingDiscovery-test-0=http://localhost:2001", "--initial-cluster-state=new"], "working-dir": "/tmp/TestClusterOf3UsingDiscovery3135499303/001", "name": "TestClusterOf3UsingDiscovery-test-0", "environment-variables": ["PATH=/home/ivan/.local/share/asdf/installs/golang/1.22.1/go/bin:/home/ivan/.local/share/asdf/installs/golang/1.22.1/go/bin:/home/ivan/.local/share/asdf/installs/golang/1.22.1/packages/bin:/home/ivan/.local/share/asdf/installs/golang/1.22.0/packages/bin:/home/ivan/bin:/home/ivan/.local/share/asdf/installs/golang/1.22.0/packages/bin:/home/ivan/bin:/home/ivan/.local/share/asdf/shims:/opt/asdf-vm/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/bin/site_perl:/usr/bin/vendor_perl:/usr/bin/core_perl:/opt/rocm/bin", "ETCD_UNSUPPORTED_ARCH=amd64", "ETCD_VERIFY=all"]}
    logger.go:146: 2024-04-01T21:47:03.204-0700 INFO    started server. {"name": "TestClusterOf3UsingDiscovery-test-0", "pid": 438642}
    logger.go:146: 2024-04-01T21:47:03.204-0700 INFO    spawning process        {"args": ["/home/ivan/Code/Personal/etcd/etcd/bin/etcd-last-release", "grpc-proxy", "start", "--listen-addr", "localhost:2003", "--endpoints", "http://localhost:2000", "--advertise-client-url", "", "--data-dir", "/tmp/TestClusterOf3UsingDiscovery3135499303/002"], "working-dir": "/tmp/TestClusterOf3UsingDiscovery3135499303/001", "name": "TestClusterOf3UsingDiscovery-test-0", "environment-variables": ["PATH=/home/ivan/.local/share/asdf/installs/golang/1.22.1/go/bin:/home/ivan/.local/share/asdf/installs/golang/1.22.1/go/bin:/home/ivan/.local/share/asdf/installs/golang/1.22.1/packages/bin:/home/ivan/.local/share/asdf/installs/golang/1.22.0/packages/bin:/home/ivan/bin:/home/ivan/.local/share/asdf/installs/golang/1.22.0/packages/bin:/home/ivan/bin:/home/ivan/.local/share/asdf/shims:/opt/asdf-vm/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/bin/site_perl:/usr/bin/vendor_perl:/usr/bin/core_perl:/opt/rocm/bin", "ETCD_UNSUPPORTED_ARCH=amd64", "ETCD_VERIFY=all"]}
    discovery_test.go:61: dial tcp [::1]:2002: connect: connection refused
    logger.go:146: 2024-04-01T21:47:03.216-0700 INFO    closing test cluster...
    logger.go:146: 2024-04-01T21:47:03.216-0700 INFO    stopping server...      {"name": "TestClusterOf3UsingDiscovery-test-0"}
    logger.go:146: 2024-04-01T21:47:03.229-0700 INFO    stopped server. {"name": "TestClusterOf3UsingDiscovery-test-0"}
    logger.go:146: 2024-04-01T21:47:03.229-0700 INFO    closing server...       {"name": "TestClusterOf3UsingDiscovery-test-0"}
    logger.go:146: 2024-04-01T21:47:03.229-0700 INFO    removing directory      {"data-dir": "/tmp/TestClusterOf3UsingDiscovery3135499303/002"}
    logger.go:146: 2024-04-01T21:47:03.235-0700 INFO    closed test cluster.
--- FAIL: TestClusterOf3UsingDiscovery (1.06s)
=== RUN   TestTLSClusterOf3UsingDiscovery
    before.go:36: Changing working directory to: /tmp/TestTLSClusterOf3UsingDiscovery294755773/001
    logger.go:146: 2024-04-01T21:47:03.236-0700 INFO    starting server...      {"name": "TestTLSClusterOf3UsingDiscovery-test-0"}
    logger.go:146: 2024-04-01T21:47:03.236-0700 INFO    spawning process        {"args": ["/home/ivan/Code/Personal/etcd/etcd/bin/etcd-last-release", "--name=TestTLSClusterOf3UsingDiscovery-test-0", "--listen-client-urls=http://localhost:2000", "--advertise-client-urls=http://localhost:2000", "--listen-peer-urls=http://localhost:2001", "--initial-advertise-peer-urls=http://localhost:2001", "--initial-cluster-token=new", "--data-dir", "/tmp/TestTLSClusterOf3UsingDiscovery294755773/002", "--snapshot-count=10000", "--enable-v2", "--initial-cluster-token=new", "--initial-cluster=TestTLSClusterOf3UsingDiscovery-test-0=http://localhost:2001", "--initial-cluster-state=new"], "working-dir": "/tmp/TestTLSClusterOf3UsingDiscovery294755773/001", "name": "TestTLSClusterOf3UsingDiscovery-test-0", "environment-variables": ["PATH=/home/ivan/.local/share/asdf/installs/golang/1.22.1/go/bin:/home/ivan/.local/share/asdf/installs/golang/1.22.1/go/bin:/home/ivan/.local/share/asdf/installs/golang/1.22.1/packages/bin:/home/ivan/.local/share/asdf/installs/golang/1.22.0/packages/bin:/home/ivan/bin:/home/ivan/.local/share/asdf/installs/golang/1.22.0/packages/bin:/home/ivan/bin:/home/ivan/.local/share/asdf/shims:/opt/asdf-vm/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/bin/site_perl:/usr/bin/vendor_perl:/usr/bin/core_perl:/opt/rocm/bin", "ETCD_UNSUPPORTED_ARCH=amd64", "ETCD_VERIFY=all"]}
    logger.go:146: 2024-04-01T21:47:04.162-0700 INFO    started server. {"name": "TestTLSClusterOf3UsingDiscovery-test-0", "pid": 438699}
    logger.go:146: 2024-04-01T21:47:04.162-0700 INFO    spawning process        {"args": ["/home/ivan/Code/Personal/etcd/etcd/bin/etcd-last-release", "grpc-proxy", "start", "--listen-addr", "localhost:2003", "--endpoints", "http://localhost:2000", "--advertise-client-url", "", "--data-dir", "/tmp/TestTLSClusterOf3UsingDiscovery294755773/002"], "working-dir": "/tmp/TestTLSClusterOf3UsingDiscovery294755773/001", "name": "TestTLSClusterOf3UsingDiscovery-test-0", "environment-variables": ["PATH=/home/ivan/.local/share/asdf/installs/golang/1.22.1/go/bin:/home/ivan/.local/share/asdf/installs/golang/1.22.1/go/bin:/home/ivan/.local/share/asdf/installs/golang/1.22.1/packages/bin:/home/ivan/.local/share/asdf/installs/golang/1.22.0/packages/bin:/home/ivan/bin:/home/ivan/.local/share/asdf/installs/golang/1.22.0/packages/bin:/home/ivan/bin:/home/ivan/.local/share/asdf/shims:/opt/asdf-vm/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/bin/site_perl:/usr/bin/vendor_perl:/usr/bin/core_perl:/opt/rocm/bin", "ETCD_UNSUPPORTED_ARCH=amd64", "ETCD_VERIFY=all"]}
    discovery_test.go:61: dial tcp [::1]:2002: connect: connection refused
    logger.go:146: 2024-04-01T21:47:04.173-0700 INFO    closing test cluster...
    logger.go:146: 2024-04-01T21:47:04.173-0700 INFO    stopping server...      {"name": "TestTLSClusterOf3UsingDiscovery-test-0"}
    logger.go:146: 2024-04-01T21:47:04.185-0700 INFO    stopped server. {"name": "TestTLSClusterOf3UsingDiscovery-test-0"}
    logger.go:146: 2024-04-01T21:47:04.185-0700 INFO    closing server...       {"name": "TestTLSClusterOf3UsingDiscovery-test-0"}
    logger.go:146: 2024-04-01T21:47:04.185-0700 INFO    removing directory      {"data-dir": "/tmp/TestTLSClusterOf3UsingDiscovery294755773/002"}
    logger.go:146: 2024-04-01T21:47:04.192-0700 INFO    closed test cluster.
--- FAIL: TestTLSClusterOf3UsingDiscovery (0.96s)
FAIL
FAIL    go.etcd.io/etcd/tests/v3/e2e    2.778s
FAIL
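
For reference, the `-run` pattern used above selects exactly the three tests shown in the output. A minimal sketch, assuming `go test -run` applies the pattern as an unanchored regular expression to top-level test names; the helper `matches` and the name `TestClusterOf5UsingDiscovery` are hypothetical and only serve as a counterexample.

```go
package main

import (
	"fmt"
	"regexp"
)

// matches returns the names selected by the given -run style pattern.
// (Hypothetical helper for illustration only.)
func matches(pattern string, names []string) []string {
	re := regexp.MustCompile(pattern)
	var out []string
	for _, n := range names {
		if re.MatchString(n) {
			out = append(out, n)
		}
	}
	return out
}

func main() {
	names := []string{
		"TestClusterOf1UsingDiscovery",
		"TestClusterOf3UsingDiscovery",
		"TestTLSClusterOf3UsingDiscovery",
		"TestClusterOf5UsingDiscovery", // hypothetical, should not match
	}
	for _, n := range matches(`Test(TLS)?ClusterOf[13]UsingDiscovery`, names) {
		fmt.Println(n)
	}
}
```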

@jmhbnz, are we okay with ignoring e2e/discovery_test.go when the cluster_proxy tag is set? Do we want someone else to weigh in (Benjamin or Marek)?

ivanvc avatar Apr 02 '24 04:04 ivanvc

I think we can just add the !cluster_proxy tag to disable it.

fuweid avatar Apr 02 '24 04:04 fuweid

I think we can just add tag !cluster_proxy to disable

Agreed. Please feel free to raise a pr if you have time @ivanvc.

jmhbnz avatar Apr 02 '24 04:04 jmhbnz