[Flake] ScyllaCluster [It] should support scaling
https://github.com/scylladb/scylla-operator/runs/6098102031?check_suite_focus=true#step:12:828
2022-04-20T16:26:15.5863099Z Unexpected error:
2022-04-20T16:26:15.5875913Z <*errors.errorString | 0xc00029aea0>: {
2022-04-20T16:26:15.5907697Z s: "timed out waiting for the condition",
2022-04-20T16:26:15.5908547Z }
2022-04-20T16:26:15.5942141Z timed out waiting for the condition
2022-04-20T16:26:15.5942777Z occurred
2022-04-20T16:26:15.5988318Z In [It] at: github.com/scylladb/scylla-operator/test/e2e/set/scyllacluster/scyllacluster_scaling.go:172
We've hit this flake on a PR branch (#973) introducing a conntrack-cleaner DaemonSet, so bear in mind it might have been caused by the removal of TCP conntrack entries, although at this point I'm quite confident it's unrelated.
Scylla Pod basic-q25mg-us-east-1-us-east-1a-1 fails with:
2022-04-20T16:24:50.128071219Z ERROR 2022-04-20 16:24:50,127 [shard 0] init - Startup failed: std::runtime_error (This node was decommissioned and will not rejoin the ring unless override_decommission=true has been set,or all existing data is removed and the node is bootstrapped again)
What we've found so far (cc @zimnx):
An error logged by the operator during scaledown:
E0420 15:59:23.721881 1 scyllacluster/status.go:112] only last member of each rack should be decommissioning, but 2-th member of us-east-1a found decommissioning while rack had 2 members
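For orientation, a hedged reconstruction of what that check amounts to (the function below is illustrative, not the operator's actual status.go code); it shows why an ordinal-2 member still decommissioning in a rack whose member count is already 2 trips the error:

package sketch

import "fmt"

// validateDecommission is a hypothetical stand-in for the operator's sanity
// check: during a scale-down only the highest-ordinal member of a rack is
// expected to carry the decommission label.
func validateDecommission(rackName string, rackMembers int, decommissioningOrdinals []int) error {
	for _, ordinal := range decommissioningOrdinals {
		if ordinal != rackMembers-1 {
			return fmt.Errorf("only last member of each rack should be decommissioning, but %d-th member of %s found decommissioning while rack had %d members", ordinal, rackName, rackMembers)
		}
	}
	return nil
}

Calling validateDecommission("us-east-1a", 2, []int{2}) reproduces the message above, i.e. the member count presumably already reflects the downscaled spec while pod 2 is still decommissioning.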
Audit log (Pod and PVC create/patch/delete events in the test namespace):
jq -C 'select(.requestObject != null and .objectRef.namespace == "e2e-test-scyllacluster-frb8h-c5jdt" and (.objectRef.resource == "pods" or .objectRef.resource == "persistentvolumeclaims") and (.verb == "delete" or .verb == "create" or .verb == "patch")) | select(.responseObject | ( .metadata.namespace=="e2e-test-scyllacluster-frb8h-c5jdt") ) | "\(.requestReceivedTimestamp) \(.responseObject.kind) \(.responseObject.metadata.name) \(.verb)))"' kube-apiserver-audit.log |& less -R
"2022-04-20T15:49:00.665289Z PersistentVolumeClaim data-basic-q25mg-us-east-1-us-east-1a-0 create))"
"2022-04-20T15:49:00.674049Z Pod basic-q25mg-us-east-1-us-east-1a-0 create))"
"2022-04-20T15:49:00.712274Z Pod basic-q25mg-us-east-1-us-east-1a-0 patch))"
"2022-04-20T15:49:04.227469Z Pod basic-q25mg-us-east-1-us-east-1a-0 patch))"
"2022-04-20T15:49:12.850698Z Pod basic-q25mg-us-east-1-us-east-1a-0 patch))"
"2022-04-20T15:49:30.428492Z Pod basic-q25mg-us-east-1-us-east-1a-0 patch))"
"2022-04-20T15:49:33.028337Z Pod basic-q25mg-us-east-1-us-east-1a-0 patch))"
"2022-04-20T15:49:53.831287Z Pod basic-q25mg-us-east-1-us-east-1a-0 patch))"
"2022-04-20T15:50:22.703295Z Pod basic-q25mg-us-east-1-us-east-1a-0 patch))"
"2022-04-20T15:50:55.930098Z PersistentVolumeClaim data-basic-q25mg-us-east-1-us-east-1a-1 create))"
"2022-04-20T15:50:56.013056Z Pod basic-q25mg-us-east-1-us-east-1a-1 create))"
"2022-04-20T15:50:56.039772Z Pod basic-q25mg-us-east-1-us-east-1a-1 patch))"
"2022-04-20T15:52:39.141011Z Pod basic-q25mg-us-east-1-us-east-1a-1 patch))"
"2022-04-20T15:52:50.543184Z Pod basic-q25mg-us-east-1-us-east-1a-1 patch))"
"2022-04-20T15:53:01.143736Z Pod basic-q25mg-us-east-1-us-east-1a-1 patch))"
"2022-04-20T15:53:02.743794Z Pod basic-q25mg-us-east-1-us-east-1a-1 patch))"
"2022-04-20T15:53:36.369526Z Pod basic-q25mg-us-east-1-us-east-1a-1 patch))"
"2022-04-20T15:55:26.273638Z Pod basic-q25mg-us-east-1-us-east-1a-1 patch))"
"2022-04-20T15:55:26.307124Z PersistentVolumeClaim data-basic-q25mg-us-east-1-us-east-1a-2 create))"
"2022-04-20T15:55:26.329697Z Pod basic-q25mg-us-east-1-us-east-1a-2 create))"
"2022-04-20T15:55:26.347700Z Pod basic-q25mg-us-east-1-us-east-1a-2 patch))"
"2022-04-20T15:55:36.302299Z Pod basic-q25mg-us-east-1-us-east-1a-2 patch))"
"2022-04-20T15:55:41.084628Z Pod basic-q25mg-us-east-1-us-east-1a-2 patch))"
"2022-04-20T15:55:53.729845Z Pod basic-q25mg-us-east-1-us-east-1a-2 patch))"
"2022-04-20T15:55:57.338199Z Pod basic-q25mg-us-east-1-us-east-1a-2 patch))"
"2022-04-20T15:56:16.515293Z Pod basic-q25mg-us-east-1-us-east-1a-2 patch))"
"2022-04-20T15:57:56.305048Z Pod basic-q25mg-us-east-1-us-east-1a-2 patch))"
"2022-04-20T15:58:26.845648Z Pod basic-q25mg-us-east-1-us-east-1a-2 patch))"
"2022-04-20T15:59:23.649979Z Pod basic-q25mg-us-east-1-us-east-1a-2 delete))"
"2022-04-20T15:59:26.549812Z Pod basic-q25mg-us-east-1-us-east-1a-2 patch))"
"2022-04-20T15:59:26.592306Z Pod basic-q25mg-us-east-1-us-east-1a-2 delete))"
"2022-04-20T15:59:33.736194Z PersistentVolumeClaim data-basic-q25mg-us-east-1-us-east-1a-2 delete))"
"2022-04-20T15:59:43.942403Z Pod basic-q25mg-us-east-1-us-east-1a-1 patch))"
"2022-04-20T15:59:44.107274Z Pod basic-q25mg-us-east-1-us-east-1a-1 patch))"
"2022-04-20T15:59:45.861643Z Pod basic-q25mg-us-east-1-us-east-1a-1 patch))"
"2022-04-20T15:59:46.277592Z Pod basic-q25mg-us-east-1-us-east-1a-1 patch))"
"2022-04-20T15:59:50.074942Z Pod basic-q25mg-us-east-1-us-east-1a-1 patch))"
"2022-04-20T16:01:28.283630Z Pod basic-q25mg-us-east-1-us-east-1a-1 delete))"
"2022-04-20T16:01:30.573241Z Pod basic-q25mg-us-east-1-us-east-1a-1 patch))"
"2022-04-20T16:01:30.609214Z Pod basic-q25mg-us-east-1-us-east-1a-1 delete))"
"2022-04-20T16:01:38.655117Z PersistentVolumeClaim data-basic-q25mg-us-east-1-us-east-1a-1 delete))"
"2022-04-20T16:01:49.041244Z PersistentVolumeClaim data-basic-q25mg-us-east-1-us-east-1a-1 create))"
"2022-04-20T16:01:49.065119Z Pod basic-q25mg-us-east-1-us-east-1a-1 create))"
"2022-04-20T16:01:49.105351Z Pod basic-q25mg-us-east-1-us-east-1a-1 patch))"
"2022-04-20T16:01:50.435651Z Pod basic-q25mg-us-east-1-us-east-1a-1 patch))"
"2022-04-20T16:01:52.506772Z Pod basic-q25mg-us-east-1-us-east-1a-1 patch))"
"2022-04-20T16:02:04.152209Z Pod basic-q25mg-us-east-1-us-east-1a-1 patch))"
"2022-04-20T16:02:05.219779Z Pod basic-q25mg-us-east-1-us-east-1a-1 patch))"
"2022-04-20T16:02:10.444144Z Pod basic-q25mg-us-east-1-us-east-1a-1 patch))"
"2022-04-20T16:25:03.036427Z Pod basic-q25mg-us-east-1-us-east-1a-0 patch))"
"2022-04-20T16:25:03.288236Z Pod basic-q25mg-us-east-1-us-east-1a-0 delete))"
"2022-04-20T16:25:07.127599Z Pod basic-q25mg-us-east-1-us-east-1a-1 patch))"
"2022-04-20T16:25:07.431384Z Pod basic-q25mg-us-east-1-us-east-1a-1 delete))"
storage-provisioner.yaml
apiVersion: v1
kind: Pod
metadata:
annotations:
kubectl.kubernetes.io/last-applied-configuration: |
{"apiVersion":"v1","kind":"Pod","metadata":{"annotations":{},"labels":{"addonmanager.kubernetes.io/mode":"Reconcile","integration-test":"storage-provisioner"},"name":"storage-provisioner","namespace":"kube-system"},"spec":{"containers":[{"command":["/storage-provisioner"],"image":"gcr.io/k8s-minikube/storage-provisioner:v5","imagePullPolicy":"IfNotPresent","name":"storage-provisioner","volumeMounts":[{"mountPath":"/tmp","name":"tmp"}]}],"hostNetwork":true,"serviceAccountName":"storage-provisioner","volumes":[{"hostPath":{"path":"/tmp","type":"Directory"},"name":"tmp"}]}}
creationTimestamp: "2022-04-20T15:45:27Z"
labels:
addonmanager.kubernetes.io/mode: Reconcile
integration-test: storage-provisioner
name: storage-provisioner
namespace: kube-system
resourceVersion: "5463"
uid: c58226ec-8255-484a-b0a4-edfc30ed2d26
spec:
containers:
- command:
- /storage-provisioner
image: gcr.io/k8s-minikube/storage-provisioner:v5
imagePullPolicy: IfNotPresent
name: storage-provisioner
resources: {}
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /tmp
name: tmp
- mountPath: /var/run/secrets/kubernetes.io/serviceaccount
name: kube-api-access-2qf8x
readOnly: true
dnsPolicy: ClusterFirst
enableServiceLinks: true
hostNetwork: true
nodeName: fv-az180-796
preemptionPolicy: PreemptLowerPriority
priority: 0
restartPolicy: Always
schedulerName: default-scheduler
securityContext: {}
serviceAccount: storage-provisioner
serviceAccountName: storage-provisioner
terminationGracePeriodSeconds: 30
tolerations:
- effect: NoExecute
key: node.kubernetes.io/not-ready
operator: Exists
tolerationSeconds: 300
- effect: NoExecute
key: node.kubernetes.io/unreachable
operator: Exists
tolerationSeconds: 300
volumes:
- hostPath:
path: /tmp
type: Directory
name: tmp
- name: kube-api-access-2qf8x
projected:
defaultMode: 420
sources:
- serviceAccountToken:
expirationSeconds: 3607
path: token
- configMap:
items:
- key: ca.crt
path: ca.crt
name: kube-root-ca.crt
- downwardAPI:
items:
- fieldRef:
apiVersion: v1
fieldPath: metadata.namespace
path: namespace
status:
conditions:
- lastProbeTime: null
lastTransitionTime: "2022-04-20T15:45:27Z"
status: "True"
type: Initialized
- lastProbeTime: null
lastTransitionTime: "2022-04-20T15:55:15Z"
status: "True"
type: Ready
- lastProbeTime: null
lastTransitionTime: "2022-04-20T15:55:15Z"
status: "True"
type: ContainersReady
- lastProbeTime: null
lastTransitionTime: "2022-04-20T15:45:27Z"
status: "True"
type: PodScheduled
containerStatuses:
- containerID: docker://0912254df14bfb68bce9892c4d12fddb741cf957ed6dcb1df25b665ad2762e25
image: gcr.io/k8s-minikube/storage-provisioner:v5
imageID: docker-pullable://gcr.io/k8s-minikube/storage-provisioner@sha256:18eb69d1418e854ad5a19e399310e52808a8321e4c441c1dddad8977a0d7a944
lastState:
terminated:
containerID: docker://325016c30b91389a7f4a2f439ee367490cac7763ca11355268ed41cfb90bb8a3
exitCode: 255
finishedAt: "2022-04-20T15:53:49Z"
reason: Error
startedAt: "2022-04-20T15:52:05Z"
name: storage-provisioner
ready: true
restartCount: 5
started: true
state:
running:
startedAt: "2022-04-20T15:55:13Z"
hostIP: 10.1.0.135
phase: Running
podIP: 10.1.0.135
podIPs:
- ip: 10.1.0.135
qosClass: BestEffort
startTime: "2022-04-20T15:45:27Z"
storage-provisioner.logs
I0420 16:01:38.875727 1 controller.go:1478] delete "pvc-b11deda4-b78b-4ecc-ad84-ce7ccd6975a0": volume deletion ignored: ignored because identity annotation on PV does not match ours
It seems that after the data-basic-q25mg-us-east-1-us-east-1a-1 PVC was deleted, storage-provisioner didn't delete the PV, and hence the backing storage wasn't cleared. After scaling back up, data-basic-q25mg-us-east-1-us-east-1a-1 was successfully provisioned again.
Possibly related: Deleted hostpath PVs stuck in released state after minikube restart #4546
If you look at the audit logs, can you dump the PV to see why the provisioner thinks differently? https://github.com/kubernetes/minikube/blob/f567893cdbb8e98a0348c1d93455af71cf6ce12c/pkg/storage/storage_provisioner.go#L97-L112
yq '.items.[] | select(.spec.claimRef | (.namespace == "e2e-test-scyllacluster-frb8h-c5jdt" )) | [.]' persistentvolumes.yaml
- apiVersion: v1
kind: PersistentVolume
metadata:
annotations:
hostPathProvisionerIdentity: 32a4dd97-e96a-4482-8dc2-8d12dbecde78
pv.kubernetes.io/provisioned-by: k8s.io/minikube-hostpath
creationTimestamp: "2022-04-20T15:52:25Z"
finalizers:
- kubernetes.io/pv-protection
name: pvc-b11deda4-b78b-4ecc-ad84-ce7ccd6975a0
resourceVersion: "8188"
uid: 174d7932-73cb-4be7-82ea-c080d8441768
spec:
accessModes:
- ReadWriteOnce
capacity:
storage: 100Mi
claimRef:
apiVersion: v1
kind: PersistentVolumeClaim
name: data-basic-q25mg-us-east-1-us-east-1a-1
namespace: e2e-test-scyllacluster-frb8h-c5jdt
resourceVersion: "3197"
uid: b11deda4-b78b-4ecc-ad84-ce7ccd6975a0
hostPath:
path: /tmp/hostpath-provisioner/e2e-test-scyllacluster-frb8h-c5jdt/data-basic-q25mg-us-east-1-us-east-1a-1
type: ""
persistentVolumeReclaimPolicy: Delete
storageClassName: standard
volumeMode: Filesystem
status:
phase: Released
- apiVersion: v1
kind: PersistentVolume
metadata:
annotations:
hostPathProvisionerIdentity: 69c3750d-ffae-41a3-99bf-94ec19737ca7
pv.kubernetes.io/provisioned-by: k8s.io/minikube-hostpath
creationTimestamp: "2022-04-20T15:49:00Z"
finalizers:
- kubernetes.io/pv-protection
name: pvc-e18bc16e-4a9c-4c55-8755-7212a3a2fa15
resourceVersion: "12566"
uid: 13ce2d8e-2943-46c1-8f36-d9eecd8cb075
spec:
accessModes:
- ReadWriteOnce
capacity:
storage: 100Mi
claimRef:
apiVersion: v1
kind: PersistentVolumeClaim
name: data-basic-q25mg-us-east-1-us-east-1a-0
namespace: e2e-test-scyllacluster-frb8h-c5jdt
resourceVersion: "1482"
uid: e18bc16e-4a9c-4c55-8755-7212a3a2fa15
hostPath:
path: /tmp/hostpath-provisioner/e2e-test-scyllacluster-frb8h-c5jdt/data-basic-q25mg-us-east-1-us-east-1a-0
type: ""
persistentVolumeReclaimPolicy: Delete
storageClassName: standard
volumeMode: Filesystem
status:
phase: Released
Seems like the provisioner uses a random UUID that by definition won't work across a restart: https://github.com/kubernetes/minikube/blob/f567893cdbb8e98a0348c1d93455af71cf6ce12c/pkg/storage/storage_provisioner.go#L52. Looking at the logs, what was the cause of the very first restart of the controller?
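To make the identity mismatch concrete, here is a trimmed, paraphrased sketch of the linked code (pkg/storage/storage_provisioner.go); field names and the error text follow upstream, but this is not a verbatim copy and the actual removal of the hostPath directory is elided:

package sketch

import (
	"errors"

	core "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/types"
	"k8s.io/apimachinery/pkg/util/uuid"
	"sigs.k8s.io/sig-storage-lib-external-provisioner/v6/controller"
)

type hostPathProvisioner struct {
	pvDir    string
	identity types.UID
}

func newHostPathProvisioner(pvDir string) *hostPathProvisioner {
	return &hostPathProvisioner{
		pvDir:    pvDir,
		identity: uuid.NewUUID(), // regenerated on every process start (storage_provisioner.go#L52)
	}
}

func (p *hostPathProvisioner) Delete(volume *core.PersistentVolume) error {
	ann, ok := volume.Annotations["hostPathProvisionerIdentity"]
	if !ok {
		return errors.New("identity annotation not found on PV")
	}
	// After a restart the identity no longer matches what the previous
	// incarnation stamped onto the PV, so deletion is skipped and the PV
	// stays Released -- exactly the "does not match ours" log line above.
	if ann != string(p.identity) {
		return &controller.IgnoredError{Reason: "identity annotation on PV does not match ours"}
	}
	// ... remove the backing hostPath directory and let the PV be deleted ...
	return nil
}

Since the identity is regenerated on every process start, any PV provisioned before a restart fails the comparison, Delete returns an IgnoredError, and the PV is left Released with its data intact.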
Discussed offline, but just for the record: we are not gathering logs from previous iterations of containers running in non-e2e namespaces, so the best I could find was the container statuses from the audit log:
"containerStatuses": [
{
"name": "storage-provisioner",
"state": {
"running": {
"startedAt": "2022-04-20T15:55:13Z"
}
},
"lastState": {
"terminated": {
"exitCode": 255,
"reason": "Error",
"startedAt": "2022-04-20T15:52:05Z",
"finishedAt": "2022-04-20T15:53:49Z",
"containerID": "docker://325016c30b91389a7f4a2f439ee367490cac7763ca11355268ed41cfb90bb8a3"
}
},
"ready": true,
"restartCount": 5,
"image": "gcr.io/k8s-minikube/storage-provisioner:v5",
"imageID": "docker-pullable://gcr.io/k8s-minikube/storage-provisioner@sha256:18eb69d1418e854ad5a19e399310e52808a8321e4c441c1dddad8977a0d7a944",
"containerID": "docker://0912254df14bfb68bce9892c4d12fddb741cf957ed6dcb1df25b665ad2762e25",
"started": true
}
],
I'm sending in a PR (#976) to gather logs from previous iterations and reproduce the issue.
Reproduced in Go #3252
logs_6239.zip e2e_artifacts.tar.lz4
I0428 09:35:16.761321 1 controller.go:1478] delete "pvc-2f1cc092-2e44-41be-a62f-60f0a0210548": volume deletion ignored: ignored because identity annotation on PV does not match ours
I0428 09:35:24.860347 1 volume_store.go:219] persistentvolume "pvc-48d971c5-dcf6-405f-90ff-0267a1340f6c" saved
I0428 09:35:24.868034 1 event.go:282] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"e2e-test-scyllacluster-8z8qz-9dm9m", Name:"data-basic-2ffbz-us-east-1-us-east-1a-2", UID:"48d971c5-dcf6-405f-90ff-0267a1340f6c", APIVersion:"v1", ResourceVersion:"5277", FieldPath:""}): type: 'Normal' reason: 'ProvisioningSucceeded' Successfully provisioned volume pvc-48d971c5-dcf6-405f-90ff-0267a1340f6c
E0428 09:35:27.200117 1 leaderelection.go:361] Failed to update lock: context deadline exceeded
I0428 09:35:27.200341 1 leaderelection.go:278] failed to renew lease kube-system/k8s.io-minikube-hostpath: timed out waiting for the condition
F0428 09:35:27.204318 1 controller.go:913] leaderelection lost
goroutine 1 [running]:
k8s.io/klog/v2.stacks(0xc00000e001, 0xc000428160, 0x45, 0xb0)
/Users/medya/go/pkg/mod/k8s.io/klog/[email protected]/klog.go:1021 +0xb9
k8s.io/klog/v2.(*loggingT).output(0x220df80, 0xc000000003, 0x0, 0x0, 0xc000179b20, 0x1baa9bd, 0xd, 0x391, 0x0)
/Users/medya/go/pkg/mod/k8s.io/klog/[email protected]/klog.go:970 +0x191
k8s.io/klog/v2.(*loggingT).printf(0x220df80, 0x3, 0x0, 0x0, 0x0, 0x0, 0x1701df5, 0x13, 0x0, 0x0, ...)
/Users/medya/go/pkg/mod/k8s.io/klog/[email protected]/klog.go:751 +0x191
k8s.io/klog/v2.Fatalf(...)
/Users/medya/go/pkg/mod/k8s.io/klog/[email protected]/klog.go:1509
sigs.k8s.io/sig-storage-lib-external-provisioner/v6/controller.(*ProvisionController).Run.func2()
/Users/medya/go/pkg/mod/sigs.k8s.io/sig-storage-lib-external-provisioner/[email protected]/controller/controller.go:913 +0x8f
k8s.io/client-go/tools/leaderelection.(*LeaderElector).Run.func1(0xc000420480)
/Users/medya/go/pkg/mod/k8s.io/[email protected]/tools/leaderelection/leaderelection.go:199 +0x29
k8s.io/client-go/tools/leaderelection.(*LeaderElector).Run(0xc000420480, 0x18e5568, 0xc00031ed00)
/Users/medya/go/pkg/mod/k8s.io/[email protected]/tools/leaderelection/leaderelection.go:209 +0x167
k8s.io/client-go/tools/leaderelection.RunOrDie(0x18e5568, 0xc000044018, 0x18ed938, 0xc0000c1400, 0x37e11d600, 0x2540be400, 0x77359400, 0xc000296840, 0x17a0ff8, 0x0, ...)
/Users/medya/go/pkg/mod/k8s.io/[email protected]/tools/leaderelection/leaderelection.go:222 +0x9f
sigs.k8s.io/sig-storage-lib-external-provisioner/v6/controller.(*ProvisionController).Run(0xc0000c2f00, 0x18e5568, 0xc000044018)
/Users/medya/go/pkg/mod/sigs.k8s.io/sig-storage-lib-external-provisioner/[email protected]/controller/controller.go:905 +0x49c
k8s.io/minikube/pkg/storage.StartStorageProvisioner(0x1707ad8, 0x19, 0x0, 0x0)
/Users/medya/workspace/minikube/pkg/storage/storage_provisioner.go:142 +0x41d
main.main()
/Users/medya/workspace/minikube/cmd/storage-provisioner/main.go:38 +0x146
The storage-provisioner Pod is restarted because it loses the leader election after failing to renew its lease in time. minikube/storage-provisioner has leader election turned on by default (see: https://github.com/kubernetes/minikube/blob/dfd73aed384e10772f3a70970f487450d0630e74/pkg/storage/storage_provisioner.go#L139 and https://github.com/kubernetes-sigs/sig-storage-lib-external-provisioner/blob/5fd51cfade8c066f42fc33300c3fe700b0f67cae/controller/controller.go#L212) and I don't see a way to disable it atm.
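For reference, a minimal sketch of the pattern visible in the stack trace, assuming plain client-go leader election: the lease name comes from the "failed to renew lease kube-system/k8s.io-minikube-hostpath" log line, the timeouts match the values passed to RunOrDie in the trace (15s/10s/2s), and the wrapper function and identity string are illustrative rather than minikube's actual code:

package sketch

import (
	"context"
	"time"

	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/leaderelection"
	"k8s.io/client-go/tools/leaderelection/resourcelock"
	"k8s.io/klog/v2"
)

func runWithLeaderElection(ctx context.Context, client kubernetes.Interface, run func(context.Context)) {
	lock, err := resourcelock.New(
		resourcelock.LeasesResourceLock,
		"kube-system",
		"k8s.io-minikube-hostpath", // lease name taken from the log line above
		client.CoreV1(),
		client.CoordinationV1(),
		resourcelock.ResourceLockConfig{Identity: "storage-provisioner"},
	)
	if err != nil {
		klog.Fatalf("creating resource lock: %v", err)
	}

	leaderelection.RunOrDie(ctx, leaderelection.LeaderElectionConfig{
		Lock:          lock,
		LeaseDuration: 15 * time.Second, // 0x37e11d600 ns in the RunOrDie frame above
		RenewDeadline: 10 * time.Second,
		RetryPeriod:   2 * time.Second,
		Callbacks: leaderelection.LeaderCallbacks{
			OnStartedLeading: run,
			OnStoppedLeading: func() {
				// Corresponds to "F0428 ... leaderelection lost": a missed renewal
				// is fatal, the process exits and the kubelet restarts the container.
				klog.Fatalf("leaderelection lost")
			},
		},
	})
}

Because OnStoppedLeading calls klog.Fatalf, a single missed renewal (e.g. an API server hiccup under CI load) kills the process and bumps the restartCount we see on the Pod.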
Similar issue: https://github.com/kubernetes-retired/external-storage/issues/1184
@tnozicka any thoughts on this?
to sum it up here, this is waiting for the new scylla local storage provisioner https://github.com/scylladb/scylla-operator/pull/979#issuecomment-1145676235
https://github.com/scylladb/scylla-operator/actions/runs/3393685823/jobs/5641361477#step:3:1649
We have already switched to the local storage provisioner in https://github.com/scylladb/scylla-operator/pull/1106
https://github.com/scylladb/scylla-operator/blob/5b04e01ca7c5e3320067c4e6545fa74a0da79985/.github/actions/run-e2e/action.yaml#L62-L66
@rzetelskik can we close this?
@tnozicka the original issue should be mitigated now but I believe I've seen this test being flaky quite recently, although most likely for a different reason. To keep the discussion clean I think we could close it anyway and open a separate issue or reopen this one if the e2e fails.
yep, let's open a new one for a different kind of failure