[test] TestBalancerUnderNetworkPartitionDelete failure
Observed on the latest code on the main branch.
=== FAIL: integration/clientv3/connectivity TestBalancerUnderNetworkPartitionDelete (8.12s)
cluster.go:532: Creating listener with addr: 127.0.0.1:2109805817
......
logger.go:130: 2022-08-10T22:44:54.878Z WARN m1 request stats {"member": "m1", "start time": "2022-08-10T22:44:53.878Z", "time spent": "1.000107558s", "remote": "@", "response type": "/etcdserverpb.KV/DeleteRange", "request count": 0, "request size": 3, "response count": 0, "response size": 0, "request content": "key:\"a\" "}
logger.go:130: 2022-08-10T22:44:54.879Z WARN client retrying of unary invoker failed {"target": "etcd-endpoints://0xc000550f00/localhost:m0", "method": "/etcdserverpb.KV/DeleteRange", "attempt": 0, "error": "rpc error: code = Unavailable desc = etcdserver: request timed out, possibly due to previous leader failure"}
network_partition_test.go:137: Op returned error: etcdserver: request timed out, possibly due to previous leader failure
network_partition_test.go:138: Cancelling...
network_partition_test.go:144: #0: expected 'expected error', got 'etcdserver: request timed out, possibly due to previous leader failure'
logger.go:130: 2022-08-10T22:44:54.993Z INFO m0.raft c16a4db1d4d2aea3 is starting a new election at term 8 {"member": "m0"}
......
logger.go:130: 2022-08-10T22:44:56.922Z INFO m0.raft c16a4db1d4d2aea3 became candidate at term 21 {"member": "m0"}
Saved JUnit XML test report to /home/runner/work/etcd/etcd/linux-amd64-integration-4-cpu/junit_MTY2MDE3MTMxMwo.xml
FAIL: 'integration' failed at Wed Aug 10 22:48:23 UTC 2022
......
logger.go:130: 2022-08-10T22:44:59.879Z DEBUG client retrying of unary invoker {"target": "etcd-endpoints://0xc000550f00/localhost:m0", "method": "/etcdserverpb.KV/DeleteRange", "attempt": 0}
network_partition_test.go:137: Op returned error: <nil>
network_partition_test.go:138: Cancelling...
logger.go:130: 2022-08-10T22:44:59.880Z INFO grpc [[core] [Channel #329] Channel Connectivity change to SHUTDOWN]
......
logger.go:130: 2022-08-10T22:44:59.889Z INFO m1 terminated a member {"member": "m1", "name": "m1", "advertise-peer-urls": ["unix://127.0.0.1:2110105817"], "listen-client-urls": ["unix://127.0.0.1:2110205817"], "grpc-url": "unix://localhost:m1"}
cluster.go:1392: ========= Cluster termination succeeded ===================
DONE 519 tests, 2 skipped, 1 failure in 1.199s
Error: Process completed with exit code 255.
Refer to https://github.com/etcd-io/etcd/runs/7777195728?check_suite_focus=true
After digging into this a bit, I assume this is due to:
2022-08-10T22:48:23.0647849Z network_partition_test.go:144: #0: expected 'expected error', got 'etcdserver: request timed out, possibly due to previous leader failure'
So in https://github.com/etcd-io/etcd/blob/a1fb9ff1e4de40337735d07ca0773cfc242ad00f/tests/integration/clientv3/connectivity/network_partition_test.go#L52-L55
this error isn't caught by IsClientTimeout. Should it be added there as a transient error, or does it indicate another issue?
It seems that the issue has been fixed by #14377.