[Flake] ScyllaCluster upgrades should deploy and update [It] with 3 member(s) and 1 rack(s) from 4.4.6 to 4.5.1

Open tnozicka opened this issue 3 years ago • 4 comments

https://github.com/scylladb/scylla-operator/runs/4521386790?check_suite_focus=true#step:12:589

    STEP: Collecting dumps from namespace "e2e-test-scyllacluster-tz9cz-89r6b". 12/14/21 14:44:02.417
    STEP: Destroying namespace "e2e-test-scyllacluster-tz9cz-89r6b". 12/14/21 14:44:02.845
    STEP: Waiting for namespace "e2e-test-scyllacluster-tz9cz-89r6b" to be removed. 12/14/21 14:44:02.853
  << End Captured GinkgoWriter Output

  Unexpected error:
      <*fmt.wrapError | 0xc00028f880>: {
          msg: "can't select data: Cannot achieve consistency level for cl ALL. Requires 3, alive 2",
          err: <*gocql.RequestErrUnavailable | 0xc0004ae620>{
              errorFrame: {
                  frameHeader: {version: 132, flags: 0, stream: 192, op: 0, length: 80, warnings: nil},
                  code: 4096,
                  message: "Cannot achieve consistency level for cl ALL. Requires 3, alive 2",
              },
              Consistency: 5,
              Required: 3,
              Alive: 2,
          },
      }
      can't select data: Cannot achieve consistency level for cl ALL. Requires 3, alive 2
  occurred
  In [It] at: github.com/scylladb/scylla-operator/test/e2e/set/scyllacluster/verify.go:131

  Full Stack Trace
    github.com/scylladb/scylla-operator/test/e2e/set/scyllacluster.verifyScyllaCluster({0x1b4d148, 0xc0004be0c0}, {0x1b9b1f0, 0xc0000b2b00}, 0xc0005da500, 0xc000285270)
    	github.com/scylladb/scylla-operator/test/e2e/set/scyllacluster/verify.go:131 +0x719
    github.com/scylladb/scylla-operator/test/e2e/set/scyllacluster.glob..func8.2(0xc00002f320)
    	github.com/scylladb/scylla-operator/test/e2e/set/scyllacluster/scyllacluster_upgrades.go:100 +0xd14
    reflect.Value.call({0x16247e0, 0xc0000863a0, 0x13}, {0x18f6ba7, 0x4}, {0xc00002a048, 0x1, 0x1})
    	reflect/value.go:543 +0x814
    reflect.Value.Call({0x16247e0, 0xc0000863a0, 0x1b96040}, {0xc00002a048, 0x1, 0x1})
    	reflect/value.go:339 +0xc5

All the pods were up before this check and were still up after it failed:

      lastTransitionTime: "2021-12-14T14:43:52Z"
      status: "True"
      type: Ready
      lastTransitionTime: "2021-12-14T14:42:42Z"
      status: "True"
      type: Ready
      lastTransitionTime: "2021-12-14T14:41:36Z"
      status: "True"
      type: Ready
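
For reference, the numbers in the error above ("Requires 3, alive 2") follow from the consistency level arithmetic. Below is a minimal sketch, assuming a single datacenter and a replication factor of 3 (inferred from "Requires 3" at `ALL`). The function name `requiredReplicas` is made up for illustration and is not part of the operator or of gocql.

```go
package main

import "fmt"

// requiredReplicas returns how many replicas must be alive for a request at
// the given consistency level, for replication factor rf (single-DC case).
// Hypothetical helper for illustration only.
func requiredReplicas(cl string, rf int) int {
	switch cl {
	case "ONE":
		return 1
	case "QUORUM":
		return rf/2 + 1
	case "ALL":
		return rf
	default:
		return -1 // level not covered by this sketch
	}
}

func main() {
	const rf, alive = 3, 2 // values taken from the error message above
	for _, cl := range []string{"ONE", "QUORUM", "ALL"} {
		need := requiredReplicas(cl, rf)
		fmt.Printf("%s: requires %d, alive %d, ok=%t\n", cl, need, alive, alive >= need)
	}
}
```

With one replica considered down by the coordinator, only `ALL` fails, which matches the failure seen at verify.go:131.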

e2e-artifacts.tar.lz4.zip

tnozicka • Dec 14 '21 15:12

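Note that the default probe failure threshold is 3 (we need to fix that), so a pod's Ready condition can stay "True" while individual probes are already failing. You need to double-check the events to see whether readiness started failing.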
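
To illustrate why the Ready conditions alone can be misleading, here is a minimal sketch of the threshold logic, assuming a pod is only marked not-Ready after `failureThreshold` consecutive probe failures. The function `flipsToNotReady` is a hypothetical name, not operator or Kubernetes code.

```go
package main

import "fmt"

// flipsToNotReady replays a sequence of probe results (true = success) and
// reports whether the pod would ever be marked not-Ready, i.e. whether
// failureThreshold consecutive failures occur.
func flipsToNotReady(probes []bool, failureThreshold int) bool {
	failures := 0
	for _, ok := range probes {
		if ok {
			failures = 0 // any success resets the counter
			continue
		}
		failures++
		if failures >= failureThreshold {
			return true
		}
	}
	return false
}

func main() {
	twoFailures := []bool{true, false, false, true}
	threeFailures := []bool{true, false, false, false}
	fmt.Println(flipsToNotReady(twoFailures, 3))   // Ready condition never changes
	fmt.Println(flipsToNotReady(threeFailures, 3)) // pod is marked not-Ready
	fmt.Println(flipsToNotReady(twoFailures, 1))   // a threshold of 1 would catch it
}
```

With a threshold of 3, two failed probes in a row leave `lastTransitionTime` untouched, so only the events show that readiness briefly failed.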

tnozicka • Dec 15 '21 06:12

This one also looks similar: https://github.com/scylladb/scylla-operator/runs/4522166605?check_suite_focus=true#step:12:668 (artifacts: e2e-artifacts.tar.lz4.zip)

    STEP: Waiting for namespace "e2e-test-scyllacluster-v7crr-bc2dk" to be removed. 12/14/21 15:42:19.796
  << End Captured GinkgoWriter Output

  Expected
      <[]string | len:2, cap:2>: ["10.107.7.63", "10.96.215.156"]
  to be empty
  In [It] at: github.com/scylladb/scylla-operator/test/e2e/set/scyllacluster/verify.go:126

  Full Stack Trace
    github.com/scylladb/scylla-operator/test/e2e/set/scyllacluster.verifyScyllaCluster({0x1b4d148, 0xc0002c6600}, {0x1b9b1f0, 0xc000459080}, 0xc000236c80, 0xc0007b6410)
    	github.com/scylladb/scylla-operator/test/e2e/set/scyllacluster/verify.go:126 +0x62c
    github.com/scylladb/scylla-operator/test/e2e/set/scyllacluster.glob..func8.2(0xc00055d0e0)
    	github.com/scylladb/scylla-operator/test/e2e/set/scyllacluster/scyllacluster_upgrades.go:100 +0xd14
    reflect.Value.call({0x16247e0, 0xc000559340, 0x13}, {0x18f6ba7, 0x4}, {0xc0002ec258, 0x1, 0x1})
    	reflect/value.go:543 +0x814
    reflect.Value.Call({0x16247e0, 0xc000559340, 0x1b96040}, {0xc0002ec258, 0x1, 0x1})
    	reflect/value.go:339 +0xc5

tnozicka • Dec 15 '21 06:12

@zimnx Should we close this issue? I believe it was fixed by #901.

rzetelskik • Jan 05 '22 08:01

Not sure, the upgrade test is still flaky on PRs rebased on master: https://github.com/scylladb/scylla-operator/runs/4704484346?check_suite_focus=true and https://github.com/scylladb/scylla-operator/actions/runs/1653156454/attempts/1. I need to analyze the dump.

zimnx • Jan 05 '22 10:01

Should've been fixed by https://github.com/scylladb/scylla-operator/issues/971. Any other issues related to (minor) delays in gossip propagation should be covered by https://github.com/scylladb/scylla-operator/pull/1054. @tnozicka should we close it?

rzetelskik • Oct 19 '22 12:10

+1

tnozicka • Oct 20 '22 08:10