kustomize-controller

Kubernetes 1.22.9: update forbidden fields of a StatefulSet

Open • chanwit opened this issue 2 years ago • 7 comments

Flux version v0.31.1

❯ flux check
► checking prerequisites
✔ Kubernetes 1.22.9-eks-a64ea69 >=1.20.6-0
► checking controllers
✔ helm-controller: deployment ready
► ghcr.io/fluxcd/helm-controller:v0.22.1
✔ kustomize-controller: deployment ready
► ghcr.io/fluxcd/kustomize-controller:v0.26.1
✔ notification-controller: deployment ready
► ghcr.io/fluxcd/notification-controller:v0.24.0
✔ source-controller: deployment ready
► ghcr.io/fluxcd/source-controller:v0.25.5
✔ all checks passed

Error message

StatefulSet/ww-gitops/web dry-run failed, reason: Invalid, error: StatefulSet.apps "web" is invalid: spec: Forbidden: updates to statefulset spec for fields other than 'replicas', 'template', 'updateStrategy' and 'minReadySeconds' are forbidden
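
The rule quoted in the error can be illustrated with a minimal sketch. This is not the API server's validation code: it only compares top-level spec fields of two plain dictionaries, whereas the real check runs on fully defaulted objects. The function name and the sample values (such as `nginx-headless`) are hypothetical.

```python
# Top-level spec fields that the error message says may be updated in place.
MUTABLE_SPEC_FIELDS = {"replicas", "template", "updateStrategy", "minReadySeconds"}


def forbidden_spec_changes(live_spec: dict, desired_spec: dict) -> list:
    """Return top-level spec fields that differ and are not in the mutable set."""
    changed = []
    for field in sorted(set(live_spec) | set(desired_spec)):
        if field in MUTABLE_SPEC_FIELDS:
            continue  # changes to these fields are allowed
        if live_spec.get(field) != desired_spec.get(field):
            changed.append(field)
    return changed


# Hypothetical specs: replicas may change, serviceName may not.
live = {"replicas": 3, "serviceName": "nginx", "podManagementPolicy": "OrderedReady"}
desired = {"replicas": 5, "serviceName": "nginx-headless", "podManagementPolicy": "OrderedReady"}
print(forbidden_spec_changes(live, desired))  # ['serviceName']
```

A field that is present on one side and absent on the other also counts as a difference here, which is one way an apply that looks unchanged could still be rejected.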

How to reproduce

Here are a GitRepository source and a Kustomization (KS).

---
apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: GitRepository
metadata:
  name: test-flux
  namespace: ww-gitops
spec:
  interval: 1m
  url: https://github.com/openshift-fluxv2-poc/test-flux
  ref:
    branch: wego
---
apiVersion: kustomize.toolkit.fluxcd.io/v1beta2
kind: Kustomization
metadata:
  name: test-flux
  namespace: ww-gitops
spec:
  sourceRef:
    kind: GitRepository
    name: test-flux
  targetNamespace: ww-gitops
  path: ./
  interval: 1m
  prune: false
  wait: true
  retryInterval: 1m30s
  timeout: 3m

The following is a StatefulSet with all managed fields.

apiVersion: apps/v1
kind: StatefulSet
metadata:
  creationTimestamp: "2022-06-09T16:32:10Z"
  generation: 1
  labels:
    kustomize.toolkit.fluxcd.io/name: test-flux
    kustomize.toolkit.fluxcd.io/namespace: ww-gitops
  managedFields:
  - apiVersion: apps/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:labels:
          f:kustomize.toolkit.fluxcd.io/name: {}
          f:kustomize.toolkit.fluxcd.io/namespace: {}
      f:spec:
        f:minReadySeconds: {}
        f:replicas: {}
        f:selector: {}
        f:serviceName: {}
        f:template:
          f:metadata:
            f:creationTimestamp: {}
            f:labels:
              f:app: {}
          f:spec:
            f:containers:
              k:{"name":"nginx"}:
                .: {}
                f:image: {}
                f:name: {}
                f:ports:
                  k:{"containerPort":80,"protocol":"TCP"}:
                    .: {}
                    f:containerPort: {}
                    f:name: {}
                    f:protocol: {}
                f:resources: {}
                f:volumeMounts:
                  k:{"mountPath":"/usr/share/nginx/html"}:
                    .: {}
                    f:mountPath: {}
                    f:name: {}
            f:terminationGracePeriodSeconds: {}
        f:updateStrategy: {}
        f:volumeClaimTemplates: {}
    manager: kustomize-controller
    operation: Apply
    time: "2022-06-09T16:32:10Z"
  - apiVersion: apps/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:status:
        f:availableReplicas: {}
        f:collisionCount: {}
        f:currentReplicas: {}
        f:currentRevision: {}
        f:observedGeneration: {}
        f:readyReplicas: {}
        f:replicas: {}
        f:updateRevision: {}
        f:updatedReplicas: {}
    manager: kube-controller-manager
    operation: Update
    subresource: status
    time: "2022-06-09T16:32:20Z"
  name: web
  namespace: ww-gitops
  resourceVersion: "5688059"
  uid: 7f24ab97-9306-4105-b4fd-0df8cfd4db4f
spec:
  podManagementPolicy: OrderedReady
  replicas: 3
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: nginx
  serviceName: nginx
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: nginx
    spec:
      containers:
      - image: gcr.io/google-containers/nginx-slim-amd64:0.27
        imagePullPolicy: IfNotPresent
        name: nginx
        ports:
        - containerPort: 80
          name: web
          protocol: TCP
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /usr/share/nginx/html
          name: www
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 10
  updateStrategy:
    rollingUpdate:
      partition: 0
    type: RollingUpdate
  volumeClaimTemplates:
  - apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      creationTimestamp: null
      name: www
    spec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 1Gi
      volumeMode: Filesystem
    status:
      phase: Pending
status:
  availableReplicas: 3
  collisionCount: 0
  currentReplicas: 3
  currentRevision: web-55c74fc88f
  observedGeneration: 1
  readyReplicas: 3
  replicas: 3
  updateRevision: web-55c74fc88f
  updatedReplicas: 3
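
To make the dump easier to read, here is a small sketch that lists the top-level spec fields recorded for a manager in `managedFields` and checks which of them are absent from the live `spec`. The helper name is hypothetical, and the dictionary is a trimmed copy of the object above with nested values replaced by empty placeholders. Whether the result relates to the dry-run failure is not established in this thread.

```python
def owned_spec_fields(obj: dict, manager: str) -> set:
    """Top-level spec fields recorded under `manager` in metadata.managedFields."""
    owned = set()
    for entry in obj.get("metadata", {}).get("managedFields", []):
        if entry.get("manager") != manager:
            continue
        spec_fields = entry.get("fieldsV1", {}).get("f:spec", {})
        # Keys look like "f:replicas"; strip the "f:" prefix.
        owned.update(key[2:] for key in spec_fields if key.startswith("f:"))
    return owned


# Trimmed copy of the StatefulSet above; nested values are placeholders.
statefulset = {
    "metadata": {"managedFields": [{
        "manager": "kustomize-controller",
        "operation": "Apply",
        "fieldsV1": {"f:spec": {
            "f:minReadySeconds": {}, "f:replicas": {}, "f:selector": {},
            "f:serviceName": {}, "f:template": {}, "f:updateStrategy": {},
            "f:volumeClaimTemplates": {},
        }},
    }]},
    "spec": {
        "podManagementPolicy": "OrderedReady", "replicas": 3,
        "revisionHistoryLimit": 10, "selector": {}, "serviceName": "nginx",
        "template": {}, "updateStrategy": {}, "volumeClaimTemplates": [],
    },
}

owned = owned_spec_fields(statefulset, "kustomize-controller")
missing = sorted(owned - set(statefulset["spec"]))
print(missing)  # ['minReadySeconds']
```

In this dump, `minReadySeconds` is listed as owned by kustomize-controller but does not appear in the live `spec`.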

chanwit, Jun 09 '22 16:06

I can't reproduce this on Kubernetes 1.23 (GKE) or on Kubernetes 1.24 (Kind). It must be a bug in Kubernetes 1.22...

stefanprodan, Jun 09 '22 17:06

I created a clean EKS cluster and re-tested, and the problem persisted. There's no EKS 1.23.x at the moment, so we need to wait for them to release it.

chanwit, Jun 10 '22 11:06

Can you replicate this with Kubernetes Kind?

stefanprodan, Jun 10 '22 12:06

Tested on KinD 1.22.9

❯ flux check
► checking prerequisites
✔ Kubernetes 1.22.9 >=1.20.6-0
► checking controllers
✔ helm-controller: deployment ready
► ghcr.io/fluxcd/helm-controller:v0.22.1
✔ kustomize-controller: deployment ready
► ghcr.io/fluxcd/kustomize-controller:v0.26.1
✔ notification-controller: deployment ready
► ghcr.io/fluxcd/notification-controller:v0.24.0
✔ source-controller: deployment ready
► ghcr.io/fluxcd/source-controller:v0.25.5
✔ all checks passed

☸ kind-kind in …/test-flux
❯ kubectl get ks -n ww-gitops -A
NAMESPACE   NAME        AGE    READY   STATUS
ww-gitops   test-flux   2m5s   False   StatefulSet/ww-gitops/web dry-run failed, reason: Invalid, error: StatefulSet.apps "web" is invalid: spec: Forbidden: updates to statefulset spec for fields other than 'replicas', 'template', 'updateStrategy' and 'minReadySeconds' are forbidden

chanwit, Jun 10 '22 15:06

Everything looks good when testing with K8s 1.23

❯ flux check
► checking prerequisites
✔ Kubernetes 1.23.6 >=1.20.6-0
► checking controllers
✔ helm-controller: deployment ready
► ghcr.io/fluxcd/helm-controller:v0.22.1
✔ kustomize-controller: deployment ready
► ghcr.io/fluxcd/kustomize-controller:v0.26.1
✔ notification-controller: deployment ready
► ghcr.io/fluxcd/notification-controller:v0.24.0
✔ source-controller: deployment ready
► ghcr.io/fluxcd/source-controller:v0.25.5
✔ all checks passed

☸ kind-kind in …/test-flux
❯ kubectl get ks -n ww-gitops -A
NAMESPACE   NAME        AGE     READY   STATUS
ww-gitops   test-flux   2m23s   True    Applied revision: wego/554420c11aec8671f06172fc148431b80d1db394

chanwit, Jun 10 '22 16:06

@chanwit is this only happening on this specific k8s minor version, or does it also happen with 1.21?

pjbgf, Jun 15 '22 12:06

@pjbgf I haven't had a chance to test with 1.21 yet.

chanwit, Jun 15 '22 12:06

I'm seeing the same error on EKS 1.22

d1rtym0nk3y, Oct 27 '22 14:10

Kubernetes 1.22 reaches EOL today. The reports above indicate that this could not be reproduced on k8s 1.23 and 1.24, so I will leave this marked as "bug" until we confirm that it can't be reproduced on 1.25 either.

pjbgf, Oct 28 '22 11:10