flux2
Deployment has more pod replicas when 'flux reconcile kustomization apps' is used
Describe the bug
I use flux2 for GitOps on AKS and GKE.
flux-system has separate apps and infra Kustomization YAML files (apps.yaml, infrastructure.yaml), both using apiVersion: kustomize.toolkit.fluxcd.io/v1beta2.
Each cluster's apps.yaml points to an overlay of some base apps. One app uses a Deployment manifest and has 3 pods on each cluster.
- On running the CLI command flux reconcile kustomization apps, I end up with 3 more pods for the same deployment.
- I only see this on GKE, not on AKS.
- Also, the prior pods of the deployment cannot be deleted once this command has been run; they get recreated.
- There is no error, but I obviously expected no new pods for that deployment in any cluster.
Steps to reproduce
$ flux reconcile kustomization apps
► annotating Kustomization apps in flux-system namespace
✔ Kustomization annotated
◎ waiting for Kustomization reconciliation
✔ applied revision xxx/xxx/169effe34ab12a75f26eb2e0ac87688e8fb6fed9
$ flux version
flux: v0.30.2
helm-controller: v0.17.1
image-automation-controller: v0.20.1
image-reflector-controller: v0.16.0
kustomize-controller: v0.21.1
notification-controller: v0.22.2
source-controller: v0.21.2
Expected behavior
No new pods added to the existing pods of the deployment.
Screenshots and recordings
No response
OS / Distro
Ubuntu 20.04 on WSL, able to connect to AKS and GKE via kubeconfig.
Flux version
v0.30.2
Flux check
$ flux check
► checking prerequisites
✗ flux 0.30.2 <0.31.3 (new version is available, please upgrade)
✔ Kubernetes 1.22.8-gke.201 >=1.20.6-0
► checking controllers
✔ helm-controller: deployment ready
► ghcr.io/fluxcd/helm-controller:v0.21.0
✔ image-automation-controller: deployment ready
► ghcr.io/fluxcd/image-automation-controller:v0.22.1
✔ image-reflector-controller: deployment ready
► ghcr.io/fluxcd/image-reflector-controller:v0.18.0
✔ kustomize-controller: deployment ready
► ghcr.io/fluxcd/kustomize-controller:v0.25.0
✔ notification-controller: deployment ready
► ghcr.io/fluxcd/notification-controller:v0.23.5
✔ source-controller: deployment ready
► ghcr.io/fluxcd/source-controller:v0.24.4
✔ all checks passed
Git provider
bitbucket.org
Container Registry provider
No response
Additional context
kubectl context for GKE:
$ kubectl version
WARNING: This version information is deprecated and will be replaced with the output from kubectl version --short. Use --output=yaml|json to get the full version.
Client Version: version.Info{Major:"1", Minor:"24", GitVersion:"v1.24.0", GitCommit:"4ce5a8954017644c5420bae81d72b09b735c21f0", GitTreeState:"clean", BuildDate:"2022-05-03T13:46:05Z", GoVersion:"go1.18.1", Compiler:"gc", Platform:"linux/amd64"}
Kustomize Version: v4.5.4
Server Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.8-gke.201", GitCommit:"2dca91e5224568a093c27d3589aa0a96fd3ddc9a", GitTreeState:"clean", BuildDate:"2022-05-11T18:39:02Z", GoVersion:"go1.16.14b7", Compiler:"gc", Platform:"linux/amd64"}
WARNING: version difference between client (1.24) and server (1.22) exceeds the supported minor version skew of +/-1
AKS context:
$ kubectl version
WARNING: This version information is deprecated and will be replaced with the output from kubectl version --short. Use --output=yaml|json to get the full version.
Client Version: version.Info{Major:"1", Minor:"24", GitVersion:"v1.24.0", GitCommit:"4ce5a8954017644c5420bae81d72b09b735c21f0", GitTreeState:"clean", BuildDate:"2022-05-03T13:46:05Z", GoVersion:"go1.18.1", Compiler:"gc", Platform:"linux/amd64"}
Kustomize Version: v4.5.4
Server Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.7", GitCommit:"3c28d5b50e68d9c6da51840bbdebf6bd0673dde5", GitTreeState:"clean", BuildDate:"2022-06-04T17:59:02Z", GoVersion:"go1.16.10", Compiler:"gc", Platform:"linux/amd64"}
WARNING: version difference between client (1.24) and server (1.21) exceeds the supported minor version skew of +/-1
Code of Conduct
- [X] I agree to follow this project's Code of Conduct
If you run flux diff for that Kustomization against the GKE cluster, do you see any changes to the replicas field inside the deployment spec?
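(For reference, such a diff can be produced against the current kubectl context with something like the command below; the overlay path is illustrative and depends on the repository layout.)
$ flux diff kustomization apps --path ./apps/overlays/gke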
No, I could not see a change to replicas in the Deployment spec in the flux diff for that Kustomization against the GKE cluster. The diff did report:
⚠️ identified at least one change, exiting with non-zero exit code
but it did not show any change to the Deployment spec of these apps.
$ kubectl get deployments -n graph
NAME READY UP-TO-DATE AVAILABLE AGE
agb 1/1 1 1 99s
age 1/1 1 1 99s
agv 1/1 1 1 99s
$ kubectl get pods -n graph
NAME READY STATUS RESTARTS AGE
agb-6594984777-w4dbp 1/1 Running 0 49m
agb-759dc4787d-h48n8 1/1 Running 0 28m
age-55447bdcfc-klpnk 1/1 Running 1 (28m ago) 28m
age-67bb97c88c-49896 1/1 Running 1 (49m ago) 49m
agv-597c44dc5-mt9k5 1/1 Running 0 19h
agv-bbf5bf7-b2vgt 1/1 Running 0 28m
This is after I ran:
$ flux reconcile kustomization apps
► annotating Kustomization apps in flux-system namespace
✔ Kustomization annotated
◎ waiting for Kustomization reconciliation
OK, so it's not Flux that's changing the replicas. Do you have an HPA set up for this deployment?
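(For reference, any HPAs in the namespace can be listed with, for example:)
$ kubectl get hpa -n graph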
No, no HPA. Sharing one of the deployment manifests. Do you suspect a problem with the deployment manifest?
apiVersion: apps/v1
kind: Deployment
metadata:
name: agv
namespace: graph
labels:
app.kubernetes.io/name: agv
spec:
selector:
matchLabels:
app.kubernetes.io/name: agv
replicas: 1
template:
metadata:
labels:
app.kubernetes.io/name: agv
spec:
restartPolicy: Always
serviceAccountName: graph-sa
containers:
- name: agv
image: "bitnine/agviewer:latest"
imagePullPolicy: IfNotPresent
securityContext:
capabilities:
add:
- SETPCAP
- MKNOD
- AUDIT_WRITE
- CHOWN
- DAC_OVERRIDE
- FOWNER
- FSETID
- KILL
- SETGID
- SETUID
- NET_BIND_SERVICE
- SYS_CHROOT
- SETFCAP
drop:
- NET_RAW
ports:
- name: agv-port
containerPort: 3001
resources:
limits:
cpu: 150m
memory: 150Mi
requests:
cpu: 50m
memory: 80Mi
volumeMounts:
- name: agv-db-disk
mountPath: /disc
volumes:
- name: agv-db-disk
emptyDir: {}
If kustomize build . | kubectl apply -f - does not lead to extra pod creation, then why would flux reconcile kustomization apps result in the extra pods?
This may have to do with server-side apply; Flux does not use kubectl. Please post here the deployment with kubectl get --show-managed-fields -oyaml.
here it is
$ kubectl -n graph get deployment/agv --show-managed-fields -o yaml
apiVersion: apps/v1
kind: Deployment
metadata:
annotations:
deployment.kubernetes.io/revision: "1"
creationTimestamp: "2022-06-30T15:00:00Z"
generation: 1
labels:
app.kubernetes.io/name: agv
kustomize.toolkit.fluxcd.io/name: apps
kustomize.toolkit.fluxcd.io/namespace: flux-system
managedFields:
- apiVersion: apps/v1
fieldsType: FieldsV1
fieldsV1:
f:metadata:
f:labels:
f:app.kubernetes.io/name: {}
f:kustomize.toolkit.fluxcd.io/name: {}
f:kustomize.toolkit.fluxcd.io/namespace: {}
f:spec:
f:replicas: {}
f:selector: {}
f:strategy: {}
f:template:
f:metadata:
f:creationTimestamp: {}
f:labels:
f:app.kubernetes.io/name: {}
f:spec:
f:containers:
k:{"name":"agv"}:
.: {}
f:image: {}
f:imagePullPolicy: {}
f:name: {}
f:ports:
k:{"containerPort":3001,"protocol":"TCP"}:
.: {}
f:containerPort: {}
f:name: {}
f:protocol: {}
f:resources:
f:limits:
f:cpu: {}
f:memory: {}
f:requests:
f:cpu: {}
f:memory: {}
f:securityContext:
f:capabilities:
f:add: {}
f:drop: {}
f:volumeMounts:
k:{"mountPath":"/disc"}:
.: {}
f:mountPath: {}
f:name: {}
f:restartPolicy: {}
f:serviceAccountName: {}
f:volumes:
k:{"name":"agv-db-disk"}:
.: {}
f:emptyDir: {}
f:name: {}
manager: kustomize-controller
operation: Apply
time: "2022-06-30T15:00:00Z"
- apiVersion: apps/v1
fieldsType: FieldsV1
fieldsV1:
f:metadata:
f:annotations:
.: {}
f:deployment.kubernetes.io/revision: {}
f:status:
f:availableReplicas: {}
f:conditions:
.: {}
k:{"type":"Available"}:
.: {}
f:lastTransitionTime: {}
f:lastUpdateTime: {}
f:message: {}
f:reason: {}
f:status: {}
f:type: {}
k:{"type":"Progressing"}:
.: {}
f:lastTransitionTime: {}
f:lastUpdateTime: {}
f:message: {}
f:reason: {}
f:status: {}
f:type: {}
f:observedGeneration: {}
f:readyReplicas: {}
f:replicas: {}
f:updatedReplicas: {}
manager: kube-controller-manager
operation: Update
subresource: status
time: "2022-06-30T15:00:14Z"
name: agviewer
namespace: agensgraph
resourceVersion: "203200426"
uid: 4f27231c-1447-45d3-8109-17dee9810d75
spec:
progressDeadlineSeconds: 600
replicas: 1
revisionHistoryLimit: 10
selector:
matchLabels:
app.kubernetes.io/name: agv
strategy:
rollingUpdate:
maxSurge: 25%
maxUnavailable: 25%
type: RollingUpdate
template:
metadata:
creationTimestamp: null
labels:
app.kubernetes.io/name: agv
spec:
containers:
- image: bitnine/agviewer:latest
imagePullPolicy: IfNotPresent
name: agv
ports:
- containerPort: 3001
name: agv-port
protocol: TCP
resources:
limits:
cpu: 150m
memory: 150Mi
requests:
cpu: 50m
memory: 80Mi
securityContext:
capabilities:
add:
- SETPCAP
- MKNOD
- AUDIT_WRITE
- CHOWN
- DAC_OVERRIDE
- FOWNER
- FSETID
- KILL
- SETGID
- SETUID
- NET_BIND_SERVICE
- SYS_CHROOT
- SETFCAP
drop:
- NET_RAW
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /disc
name: agv-db-disk
dnsPolicy: ClusterFirst
restartPolicy: Always
schedulerName: default-scheduler
securityContext: {}
serviceAccount: graph-sa
serviceAccountName: graph-sa
terminationGracePeriodSeconds: 30
volumes:
- emptyDir: {}
name: agv-db-disk
status:
availableReplicas: 1
conditions:
- lastTransitionTime: "2022-06-30T15:00:14Z"
lastUpdateTime: "2022-06-30T15:00:14Z"
message: Deployment has minimum availability.
reason: MinimumReplicasAvailable
status: "True"
type: Available
- lastTransitionTime: "2022-06-30T15:00:00Z"
lastUpdateTime: "2022-06-30T15:00:14Z"
message: ReplicaSet "agviewer-bbf5bf7" has successfully progressed.
reason: NewReplicaSetAvailable
status: "True"
type: Progressing
observedGeneration: 1
readyReplicas: 1
replicas: 1
updatedReplicas: 1
Is this what was required, and can you see what made the extra pods?
Can you please post the same deployment YAML after a flux reconcile when it adds more replicas?
Sorry that I can't give you full names in places and used xxx in them.
$ flux reconcile kustomization apps
► annotating Kustomization apps in flux-system namespace
✔ Kustomization annotated
◎ waiting for Kustomization reconciliation
✔ applied revision xxx/xxx/864bd4dd133ae597372b8adfcdf19634c38b5d41
$ date
Thu Jun 30 06:30:43 CEST 2022
$ kubectl get pods -n agensgraph
NAME READY STATUS RESTARTS AGE
agbxxx-6594984777-lwvk2 1/1 Running 0 14h
agbxxx-759dc4787d-xtvbs 1/1 Running 0 14h
agexxx-55447bdcfc-5qrsm 1/1 Running 1 (3h59m ago) 14h
agexxx-67bb97c88c-2x4l5 1/1 Running 1 (3h57m ago) 14h
agvxxx-597c44dc5-dwgwt 1/1 Running 0 14h
agvxxx-bbf5bf7-fqd8v 1/1 Running 0 14h
One Deployment extract from the Lens UI, with some modifications to xxx where required:
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: agvxxx
namespace: xxxgraph
uid: 4f27231c-1447-45d3-8109-17dee9810d75
resourceVersion: '203656345'
generation: 1
creationTimestamp: '2022-06-30T15:00:00Z'
labels:
app.kubernetes.io/name: agvxxx
kustomize.toolkit.fluxcd.io/name: apps
kustomize.toolkit.fluxcd.io/namespace: flux-system
annotations:
deployment.kubernetes.io/revision: '1'
managedFields:
- manager: kustomize-controller
operation: Apply
apiVersion: apps/v1
time: '2022-06-30T15:00:00Z'
fieldsType: FieldsV1
fieldsV1:
f:metadata:
f:labels:
f:app.kubernetes.io/name: {}
f:kustomize.toolkit.fluxcd.io/name: {}
f:kustomize.toolkit.fluxcd.io/namespace: {}
f:spec:
f:replicas: {}
f:selector: {}
f:strategy: {}
f:template:
f:metadata:
f:creationTimestamp: {}
f:labels:
f:app.kubernetes.io/name: {}
f:spec:
f:containers:
k:{"name":"agvxxx"}:
.: {}
f:image: {}
f:imagePullPolicy: {}
f:name: {}
f:ports:
k:{"containerPort":3001,"protocol":"TCP"}:
.: {}
f:containerPort: {}
f:name: {}
f:protocol: {}
f:resources:
f:limits:
f:cpu: {}
f:memory: {}
f:requests:
f:cpu: {}
f:memory: {}
f:securityContext:
f:capabilities:
f:add: {}
f:drop: {}
f:volumeMounts:
k:{"mountPath":"/disc"}:
.: {}
f:mountPath: {}
f:name: {}
f:restartPolicy: {}
f:serviceAccountName: {}
f:volumes:
k:{"name":"agvxxx-db-disk"}:
.: {}
f:emptyDir: {}
f:name: {}
- manager: kube-controller-manager
operation: Update
apiVersion: apps/v1
time: '2022-07-01T06:03:27Z'
fieldsType: FieldsV1
fieldsV1:
f:metadata:
f:annotations:
.: {}
f:deployment.kubernetes.io/revision: {}
f:status:
f:availableReplicas: {}
f:conditions:
.: {}
k:{"type":"Available"}:
.: {}
f:lastTransitionTime: {}
f:lastUpdateTime: {}
f:message: {}
f:reason: {}
f:status: {}
f:type: {}
k:{"type":"Progressing"}:
.: {}
f:lastTransitionTime: {}
f:lastUpdateTime: {}
f:message: {}
f:reason: {}
f:status: {}
f:type: {}
f:observedGeneration: {}
f:readyReplicas: {}
f:replicas: {}
f:updatedReplicas: {}
subresource: status
selfLink: /apis/apps/v1/namespaces/xxxgraph/deployments/agvxxx
status:
observedGeneration: 1
replicas: 1
updatedReplicas: 1
readyReplicas: 1
availableReplicas: 1
conditions:
- type: Progressing
status: 'True'
lastUpdateTime: '2022-06-30T15:00:14Z'
lastTransitionTime: '2022-06-30T15:00:00Z'
reason: NewReplicaSetAvailable
message: ReplicaSet "agvxxx-bbf5bf7" has successfully progressed.
- type: Available
status: 'True'
lastUpdateTime: '2022-07-01T06:03:27Z'
lastTransitionTime: '2022-07-01T06:03:27Z'
reason: MinimumReplicasAvailable
message: Deployment has minimum availability.
spec:
replicas: 1
selector:
matchLabels:
app.kubernetes.io/name: agvxxx
template:
metadata:
creationTimestamp: null
labels:
app.kubernetes.io/name: agvxxx
spec:
volumes:
- name: agvxxx-db-disk
emptyDir: {}
containers:
- name: agvxxx
image: bitnine/agviewer:latest
ports:
- name: agvxxx-port
containerPort: 3001
protocol: TCP
resources:
limits:
cpu: 150m
memory: 150Mi
requests:
cpu: 50m
memory: 80Mi
volumeMounts:
- name: agvxxx-db-disk
mountPath: /disc
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
imagePullPolicy: IfNotPresent
securityContext:
capabilities:
add:
- SETPCAP
- MKNOD
- AUDIT_WRITE
- CHOWN
- DAC_OVERRIDE
- FOWNER
- FSETID
- KILL
- SETGID
- SETUID
- NET_BIND_SERVICE
- SYS_CHROOT
- SETFCAP
drop:
- NET_RAW
restartPolicy: Always
terminationGracePeriodSeconds: 30
dnsPolicy: ClusterFirst
serviceAccountName: xxxgraph-sa
serviceAccount: xxxgraph-sa
securityContext: {}
schedulerName: default-scheduler
strategy:
type: RollingUpdate
rollingUpdate:
maxUnavailable: 25%
maxSurge: 25%
revisionHistoryLimit: 10
progressDeadlineSeconds: 600
Sorry again that I can't give full names in places and used xxx. I applied the reconcile for the flux-system apps Kustomization again, and this time it did not add pods; the pod list and the Deployment YAML are identical to what is posted above.
Based on what Kubernetes says, it can't create 3 replicas since the status is:
status:
observedGeneration: 1
replicas: 1
updatedReplicas: 1
readyReplicas: 1
availableReplicas: 1
I guess this is a Kubernetes bug when a deployment is applied using server-side apply. I think you should be able to reproduce the bug with kubectl apply --server-side --field-manager=kustomize-controller.
No pods were added.
$ kubectl apply --server-side --field-manager=kustomize-controller -f agvxxx.Deployment.yaml
deployment.apps/agvxxx serverside-applied
$ kubectl get pods -n xxxgraph
NAME READY STATUS RESTARTS AGE
agbxxx-6594984777-lwvk2 1/1 Running 0 14h
agbxxx-759dc4787d-xtvbs 1/1 Running 0 14h
agexxx-55447bdcfc-5qrsm 1/1 Running 1 (4h48m ago) 14h
agexxx-67bb97c88c-2x4l5 1/1 Running 1 (4h45m ago) 14h
agvxxx-597c44dc5-dwgwt 1/1 Running 0 14h
agvxxx-bbf5bf7-fqd8v 1/1 Running 0 14h
Lens UI extract:
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: agvxxx
namespace: xxxgraph
uid: 4f27231c-1447-45d3-8109-17dee9810d75
resourceVersion: '203865002'
generation: 1
creationTimestamp: '2022-06-30T15:00:00Z'
labels:
app.kubernetes.io/name: agvxxx
annotations:
deployment.kubernetes.io/revision: '1'
managedFields:
- manager: kustomize-controller
operation: Apply
apiVersion: apps/v1
time: '2022-07-01T10:49:30Z'
fieldsType: FieldsV1
fieldsV1:
f:metadata:
f:labels:
f:app.kubernetes.io/name: {}
f:spec:
f:replicas: {}
f:selector: {}
f:template:
f:metadata:
f:labels:
f:app.kubernetes.io/name: {}
f:spec:
f:containers:
k:{"name":"agvxxx"}:
.: {}
f:image: {}
f:imagePullPolicy: {}
f:name: {}
f:ports:
k:{"containerPort":3001,"protocol":"TCP"}:
.: {}
f:containerPort: {}
f:name: {}
f:resources:
f:limits:
f:cpu: {}
f:memory: {}
f:requests:
f:cpu: {}
f:memory: {}
f:securityContext:
f:capabilities:
f:add: {}
f:drop: {}
f:volumeMounts:
k:{"mountPath":"/disc"}:
.: {}
f:mountPath: {}
f:name: {}
f:restartPolicy: {}
f:serviceAccountName: {}
f:volumes:
k:{"name":"agvxxx-db-disk"}:
.: {}
f:emptyDir: {}
f:name: {}
- manager: kube-controller-manager
operation: Update
apiVersion: apps/v1
time: '2022-07-01T06:03:27Z'
fieldsType: FieldsV1
fieldsV1:
f:metadata:
f:annotations:
.: {}
f:deployment.kubernetes.io/revision: {}
f:status:
f:availableReplicas: {}
f:conditions:
.: {}
k:{"type":"Available"}:
.: {}
f:lastTransitionTime: {}
f:lastUpdateTime: {}
f:message: {}
f:reason: {}
f:status: {}
f:type: {}
k:{"type":"Progressing"}:
.: {}
f:lastTransitionTime: {}
f:lastUpdateTime: {}
f:message: {}
f:reason: {}
f:status: {}
f:type: {}
f:observedGeneration: {}
f:readyReplicas: {}
f:replicas: {}
f:updatedReplicas: {}
subresource: status
selfLink: /apis/apps/v1/namespaces/xxxgraph/deployments/agvxxx
status:
observedGeneration: 1
replicas: 1
updatedReplicas: 1
readyReplicas: 1
availableReplicas: 1
conditions:
- type: Progressing
status: 'True'
lastUpdateTime: '2022-06-30T15:00:14Z'
lastTransitionTime: '2022-06-30T15:00:00Z'
reason: NewReplicaSetAvailable
message: ReplicaSet "agvxxx-bbf5bf7" has successfully progressed.
- type: Available
status: 'True'
lastUpdateTime: '2022-07-01T06:03:27Z'
lastTransitionTime: '2022-07-01T06:03:27Z'
reason: MinimumReplicasAvailable
message: Deployment has minimum availability.
spec:
replicas: 1
selector:
matchLabels:
app.kubernetes.io/name: agvxxx
template:
metadata:
creationTimestamp: null
labels:
app.kubernetes.io/name: agvxxx
spec:
volumes:
- name: agvxxx-db-disk
emptyDir: {}
containers:
- name: agvxxx
image: bitnine/agviewer:latest
ports:
- name: agvxxx-port
containerPort: 3001
protocol: TCP
resources:
limits:
cpu: 150m
memory: 150Mi
requests:
cpu: 50m
memory: 80Mi
volumeMounts:
- name: agvxxx-db-disk
mountPath: /disc
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
imagePullPolicy: IfNotPresent
securityContext:
capabilities:
add:
- SETPCAP
- MKNOD
- AUDIT_WRITE
- CHOWN
- DAC_OVERRIDE
- FOWNER
- FSETID
- KILL
- SETGID
- SETUID
- NET_BIND_SERVICE
- SYS_CHROOT
- SETFCAP
drop:
- NET_RAW
restartPolicy: Always
terminationGracePeriodSeconds: 30
dnsPolicy: ClusterFirst
serviceAccountName: xxxgraph-sa
serviceAccount: xxxgraph-sa
securityContext: {}
schedulerName: default-scheduler
strategy:
type: RollingUpdate
rollingUpdate:
maxUnavailable: 25%
maxSurge: 25%
revisionHistoryLimit: 10
progressDeadlineSeconds: 600
The problem happens only the first time I run flux reconcile kustomization apps, not on later runs.
OK, this may be due to having run kubectl on that deployment after Flux applied it. Flux removes all the kubectl edits from the cluster, so if you've run kubectl rollout restart, Flux will remove the annotation and Kubernetes will restart the deployment.
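(For context, kubectl rollout restart works by stamping an annotation into the pod template, roughly as sketched below; if a later apply from a field manager that does not own that annotation drops it, the Deployment rolls again. The timestamp value is illustrative.)
spec:
  template:
    metadata:
      annotations:
        kubectl.kubernetes.io/restartedAt: "2022-07-01T10:00:00Z"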
But I only use flux2 for app releases from Git into the cluster; there is no kubectl involvement. The only exception was this time, when you asked me to use kubectl apply --server-side --field-manager=kustomize-controller -f agvxxx.Deployment.yaml; otherwise it's Flux.
Since Flux takes time to release manifests into the cluster, I use flux reconcile kustomization apps to speed it up (only the first time, or just once).
OK, I can't explain this bug then; to dig into it we would need to reproduce this behaviour with Kubernetes Kind in CI.
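(A minimal local reproduction would look roughly like the following; the Git URL, branch and path are placeholders for a test repository containing the same Deployment manifest.)
$ kind create cluster --name flux-repro
$ flux install
$ flux create source git test-repo --url=https://github.com/<org>/<repo> --branch=main
$ flux create kustomization apps --source=GitRepository/test-repo --path=./apps --prune=true --interval=10m
$ flux reconcile kustomization apps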
OK. Do I have to do it with a kind cluster? If so, when you mention CI, should I use Bitbucket or is it the CI for the flux2 repo?
The problem is I can't change my cluster at this moment, with flux2 working and so many apps deployed. But I am wondering whether there could be a flag for the flux reconcile kustomization apps command that stops the creation of pods for a deployment.
The command doesn't do anything to deployments; it only adds an annotation to the Flux Kustomization, so that kustomize-controller sees a change and reruns the server-side apply of whatever is in Git. Flux has no knowledge of deployments, it's just some YAML.
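(Concretely, the CLI patches an annotation along these lines onto the Kustomization object in flux-system; the timestamp value shown here is illustrative, the CLI sets it at request time.)
metadata:
  annotations:
    reconcile.fluxcd.io/requestedAt: "2022-07-01T10:00:00Z"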
Now I'm really scared to use the command, because I can't even delete the old pods, as they get recreated on deletion. For a single-replica deployment there are now 2 ReplicaSets and 2 pods against it, just from running that command. Given what you mention, that kustomize-controller sees a change and reruns the server-side apply of whatever is in Git: is it possible to do a joint test via Slack and figure this out on a kind cluster or on GKE?
Can you please try something: delete the replicas: 1 from Git and also delete the deployment from the cluster. Then let Flux recreate it and rerun flux reconcile to see if anything changes.
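(Concretely, assuming the agv Deployment in the graph namespace shown earlier, that means removing the replicas: 1 line from the manifest in Git and then running something like:)
$ kubectl delete deployment agv -n graph
$ flux reconcile kustomization apps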
Is it possible to do a joint test via Slack and figure this out on a kind cluster or on GKE?
Yes we can do this next week on CNCF Slack.
I followed the steps.
- deleted the replicas: 1 from Git
- deleted the deployment from the cluster
- let Flux recreate it
- waited for the pods
Pod situation after that:
$ kubectl get pods -n xxxgraph
NAME READY STATUS RESTARTS AGE
agbxxx-6594984777-lwvk2 1/1 Running 0 16h
agbxxx-759dc4787d-xtvbs 1/1 Running 0 16h
agexxx-55447bdcfc-5qrsm 1/1 Running 1 (6h48m ago) 16h
agexxx-67bb97c88c-2x4l5 1/1 Running 1 (6h45m ago) 16h
agvxxx-597c44dc5-dwgwt 1/1 Running 0 16h
agvxxx-687c4b5dbd-qt492 1/1 Running 0 33s
agvxxx-bbf5bf7-fqd8v 1/1 Running 0 16h
In Git:
apiVersion: apps/v1
kind: Deployment
metadata:
name: agvxxx
namespace: xxxgraph
labels:
app.kubernetes.io/name: agvxxx
spec:
selector:
matchLabels:
app.kubernetes.io/name: agvxxx
# replicas: 1
template:
metadata:
labels:
app.kubernetes.io/name: agvxxx
Then I ran:
$ flux reconcile kustomization apps
► annotating Kustomization apps in flux-system namespace
✔ Kustomization annotated
◎ waiting for Kustomization reconciliation
✔ applied revision xxx/xxx/49e86e11f03197f3d4403367d313928d3ad9fce0
Pod situation: no changes from what is seen above.
apiVersion: apps/v1
kind: Deployment
metadata:
name: agvxxx
namespace: xxxgraph
uid: 8aa108ed-4e25-4cb2-a063-311f09b9f660
resourceVersion: '203954199'
generation: 1
creationTimestamp: '2022-07-01T12:49:22Z'
labels:
app.kubernetes.io/name: agvxxx
kustomize.toolkit.fluxcd.io/name: apps
kustomize.toolkit.fluxcd.io/namespace: flux-system
annotations:
deployment.kubernetes.io/revision: '1'
managedFields:
- manager: kustomize-controller
operation: Apply
apiVersion: apps/v1
time: '2022-07-01T12:49:22Z'
fieldsType: FieldsV1
fieldsV1:
f:metadata:
f:labels:
f:app.kubernetes.io/name: {}
f:kustomize.toolkit.fluxcd.io/name: {}
f:kustomize.toolkit.fluxcd.io/namespace: {}
f:spec:
f:selector: {}
f:strategy: {}
f:template:
f:metadata:
f:creationTimestamp: {}
f:labels:
f:app.kubernetes.io/name: {}
f:spec:
f:containers:
k:{"name":"agvxxx"}:
.: {}
f:image: {}
f:imagePullPolicy: {}
f:name: {}
f:ports:
k:{"containerPort":3001,"protocol":"TCP"}:
.: {}
f:containerPort: {}
f:name: {}
f:protocol: {}
f:resources:
f:limits:
f:cpu: {}
f:memory: {}
f:requests:
f:cpu: {}
f:memory: {}
f:securityContext:
f:capabilities:
f:add: {}
f:drop: {}
f:volumeMounts:
k:{"mountPath":"/disc"}:
.: {}
f:mountPath: {}
f:name: {}
f:restartPolicy: {}
f:serviceAccountName: {}
f:volumes:
k:{"name":"agvxxx-db-disk"}:
.: {}
f:emptyDir: {}
f:name: {}
- manager: kube-controller-manager
operation: Update
apiVersion: apps/v1
time: '2022-07-01T12:49:35Z'
fieldsType: FieldsV1
fieldsV1:
f:metadata:
f:annotations:
.: {}
f:deployment.kubernetes.io/revision: {}
f:status:
f:availableReplicas: {}
f:collisionCount: {}
f:conditions:
.: {}
k:{"type":"Available"}:
.: {}
f:lastTransitionTime: {}
f:lastUpdateTime: {}
f:message: {}
f:reason: {}
f:status: {}
f:type: {}
k:{"type":"Progressing"}:
.: {}
f:lastTransitionTime: {}
f:lastUpdateTime: {}
f:message: {}
f:reason: {}
f:status: {}
f:type: {}
f:observedGeneration: {}
f:readyReplicas: {}
f:replicas: {}
f:updatedReplicas: {}
subresource: status
selfLink: /apis/apps/v1/namespaces/xxxgraph/deployments/agvxxx
status:
observedGeneration: 1
replicas: 1
updatedReplicas: 1
readyReplicas: 1
availableReplicas: 1
conditions:
- type: Available
status: 'True'
lastUpdateTime: '2022-07-01T12:49:35Z'
lastTransitionTime: '2022-07-01T12:49:35Z'
reason: MinimumReplicasAvailable
message: Deployment has minimum availability.
- type: Progressing
status: 'True'
lastUpdateTime: '2022-07-01T12:49:35Z'
lastTransitionTime: '2022-07-01T12:49:22Z'
reason: NewReplicaSetAvailable
message: ReplicaSet "agviewer-687c4b5dbd" has successfully progressed.
collisionCount: 1
spec:
replicas: 1
selector:
matchLabels:
app.kubernetes.io/name: agvxxx
template:
metadata:
creationTimestamp: null
labels:
app.kubernetes.io/name: agvxxx
spec:
volumes:
- name: agvxxx-db-disk
emptyDir: {}
containers:
- name: agvxxx
image: bitnine/agviewer:latest
ports:
- name: agvxxx-port
containerPort: 3001
protocol: TCP
resources:
limits:
cpu: 150m
memory: 150Mi
requests:
cpu: 50m
memory: 80Mi
volumeMounts:
- name: agvxxx-db-disk
mountPath: /disc
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
imagePullPolicy: IfNotPresent
securityContext:
capabilities:
add:
- SETPCAP
- MKNOD
- AUDIT_WRITE
- CHOWN
- DAC_OVERRIDE
- FOWNER
- FSETID
- KILL
- SETGID
- SETUID
- NET_BIND_SERVICE
- SYS_CHROOT
- SETFCAP
drop:
- NET_RAW
restartPolicy: Always
terminationGracePeriodSeconds: 30
dnsPolicy: ClusterFirst
serviceAccountName: xxxgraph-sa
serviceAccount: xxxgraph-sa
securityContext: {}
schedulerName: default-scheduler
strategy:
type: RollingUpdate
rollingUpdate:
maxUnavailable: 25%
maxSurge: 25%
revisionHistoryLimit: 10
progressDeadlineSeconds: 600
Note: the default for replicas in a Deployment is 1.
I just wanted to report that I'm also having the same problem. I have a single-node cluster, so I only ever define 1 replica when a chart pre-defines more. These pods even stay deployed when I comment out the source in the kustomization.yaml and prune is set to true. I'm running k3s v1.24.2+k3s1.
---
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: monitoring
resources:
- namespace.yaml
- blackbox-exporter
- grafana
- kube-prometheus-stack
# - loki
- network-ups-tools
- snmp-exporter
- speedtest-exporter
- thanos
# - vector
---
apiVersion: kustomize.toolkit.fluxcd.io/v1beta2
kind: Kustomization
metadata:
name: cluster-apps
namespace: flux-system
spec:
interval: 10m
retryInterval: 1m
dependsOn:
- name: flux-config
- name: flux-repositories
- name: cluster-storage
path: ./cluster/apps
prune: true
wait: true
sourceRef:
kind: GitRepository
name: home-ops
decryption:
provider: sops
secretRef:
name: sops-age
postBuild:
substitute: {}
substituteFrom:
- kind: ConfigMap
name: cluster-config
- kind: Secret
name: cluster-secrets
loki and vector are still deployed, and the loki gateway has some extra pods. There were way more, but I deleted them manually.
mb-pro home-ops % k get pod -n monitoring
NAME READY STATUS RESTARTS AGE
loki-gateway-5b47558b5d-p72fn 0/1 Pending 0 19m
loki-gateway-5b47558b5d-qtr9s 0/1 Pending 0 20m
loki-gateway-7b8965d9-486bs 1/1 Running 0 25m
loki-read-0 1/1 Running 2 (86m ago) 23h
loki-write-0 1/1 Running 0 23h
vector-agent-hg4g2 1/1 Running 0 23h
vector-aggregator-8489db7599-zzrlf 1/1 Running 0 23h
apiVersion: apps/v1
kind: Deployment
metadata:
annotations:
deployment.kubernetes.io/revision: "1"
meta.helm.sh/release-name: loki
meta.helm.sh/release-namespace: monitoring
creationTimestamp: "2022-07-03T20:59:45Z"
generation: 1
labels:
app.kubernetes.io/component: gateway
app.kubernetes.io/instance: loki
app.kubernetes.io/managed-by: Helm
app.kubernetes.io/name: loki
app.kubernetes.io/version: 2.5.0
helm.sh/chart: loki-simple-scalable-1.6.1
helm.toolkit.fluxcd.io/name: loki
helm.toolkit.fluxcd.io/namespace: monitoring
managedFields:
- apiVersion: apps/v1
fieldsType: FieldsV1
fieldsV1:
f:metadata:
f:annotations:
.: {}
f:meta.helm.sh/release-name: {}
f:meta.helm.sh/release-namespace: {}
f:labels:
.: {}
f:app.kubernetes.io/component: {}
f:app.kubernetes.io/instance: {}
f:app.kubernetes.io/managed-by: {}
f:app.kubernetes.io/name: {}
f:app.kubernetes.io/version: {}
f:helm.sh/chart: {}
f:helm.toolkit.fluxcd.io/name: {}
f:helm.toolkit.fluxcd.io/namespace: {}
f:spec:
f:progressDeadlineSeconds: {}
f:replicas: {}
f:revisionHistoryLimit: {}
f:selector: {}
f:strategy:
f:rollingUpdate:
.: {}
f:maxSurge: {}
f:maxUnavailable: {}
f:type: {}
f:template:
f:metadata:
f:annotations:
.: {}
f:checksum/config: {}
f:labels:
.: {}
f:app.kubernetes.io/component: {}
f:app.kubernetes.io/instance: {}
f:app.kubernetes.io/name: {}
f:spec:
f:affinity:
.: {}
f:podAntiAffinity:
.: {}
f:requiredDuringSchedulingIgnoredDuringExecution: {}
f:containers:
k:{"name":"nginx"}:
.: {}
f:image: {}
f:imagePullPolicy: {}
f:name: {}
f:ports:
.: {}
k:{"containerPort":8080,"protocol":"TCP"}:
.: {}
f:containerPort: {}
f:name: {}
f:protocol: {}
f:readinessProbe:
.: {}
f:failureThreshold: {}
f:httpGet:
.: {}
f:path: {}
f:port: {}
f:scheme: {}
f:initialDelaySeconds: {}
f:periodSeconds: {}
f:successThreshold: {}
f:timeoutSeconds: {}
f:resources: {}
f:securityContext:
.: {}
f:allowPrivilegeEscalation: {}
f:capabilities:
.: {}
f:drop: {}
f:readOnlyRootFilesystem: {}
f:terminationMessagePath: {}
f:terminationMessagePolicy: {}
f:volumeMounts:
.: {}
k:{"mountPath":"/docker-entrypoint.d"}:
.: {}
f:mountPath: {}
f:name: {}
k:{"mountPath":"/etc/nginx"}:
.: {}
f:mountPath: {}
f:name: {}
k:{"mountPath":"/tmp"}:
.: {}
f:mountPath: {}
f:name: {}
f:dnsPolicy: {}
f:restartPolicy: {}
f:schedulerName: {}
f:securityContext:
.: {}
f:fsGroup: {}
f:runAsGroup: {}
f:runAsNonRoot: {}
f:runAsUser: {}
f:serviceAccount: {}
f:serviceAccountName: {}
f:terminationGracePeriodSeconds: {}
f:volumes:
.: {}
k:{"name":"config"}:
.: {}
f:configMap:
.: {}
f:defaultMode: {}
f:name: {}
f:name: {}
k:{"name":"docker-entrypoint-d-override"}:
.: {}
f:emptyDir: {}
f:name: {}
k:{"name":"tmp"}:
.: {}
f:emptyDir: {}
f:name: {}
manager: helm-controller
operation: Update
time: "2022-07-03T20:59:45Z"
- apiVersion: apps/v1
fieldsType: FieldsV1
fieldsV1:
f:metadata:
f:annotations:
f:deployment.kubernetes.io/revision: {}
f:status:
f:conditions:
.: {}
k:{"type":"Available"}:
.: {}
f:lastTransitionTime: {}
f:lastUpdateTime: {}
f:message: {}
f:reason: {}
f:status: {}
f:type: {}
k:{"type":"Progressing"}:
.: {}
f:lastTransitionTime: {}
f:lastUpdateTime: {}
f:message: {}
f:reason: {}
f:status: {}
f:type: {}
f:observedGeneration: {}
f:replicas: {}
f:unavailableReplicas: {}
f:updatedReplicas: {}
manager: k3s
operation: Update
subresource: status
time: "2022-07-03T21:01:41Z"
name: loki-gateway
namespace: monitoring
resourceVersion: "87859182"
uid: c33455e5-be1d-4f29-bb4e-060667f0e5ba
spec:
progressDeadlineSeconds: 600
replicas: 1
revisionHistoryLimit: 10
selector:
matchLabels:
app.kubernetes.io/component: gateway
app.kubernetes.io/instance: loki
app.kubernetes.io/name: loki
strategy:
rollingUpdate:
maxSurge: 25%
maxUnavailable: 25%
type: RollingUpdate
template:
metadata:
annotations:
checksum/config: 30708602e8524ee1393920b700317783b5a69b3b787c2535cc1153a518260d8d
creationTimestamp: null
labels:
app.kubernetes.io/component: gateway
app.kubernetes.io/instance: loki
app.kubernetes.io/name: loki
spec:
affinity:
podAntiAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
- labelSelector:
matchLabels:
app.kubernetes.io/component: gateway
app.kubernetes.io/instance: loki
app.kubernetes.io/name: loki
topologyKey: kubernetes.io/hostname
containers:
- image: docker.io/nginxinc/nginx-unprivileged:1.19-alpine
imagePullPolicy: IfNotPresent
name: nginx
ports:
- containerPort: 8080
name: http
protocol: TCP
readinessProbe:
failureThreshold: 3
httpGet:
path: /
port: http
scheme: HTTP
initialDelaySeconds: 15
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 1
resources: {}
securityContext:
allowPrivilegeEscalation: false
capabilities:
drop:
- ALL
readOnlyRootFilesystem: true
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /etc/nginx
name: config
- mountPath: /tmp
name: tmp
- mountPath: /docker-entrypoint.d
name: docker-entrypoint-d-override
dnsPolicy: ClusterFirst
restartPolicy: Always
schedulerName: default-scheduler
securityContext:
fsGroup: 101
runAsGroup: 101
runAsNonRoot: true
runAsUser: 101
serviceAccount: loki
serviceAccountName: loki
terminationGracePeriodSeconds: 30
volumes:
- configMap:
defaultMode: 420
name: loki-gateway
name: config
- emptyDir: {}
name: tmp
- emptyDir: {}
name: docker-entrypoint-d-override
status:
conditions:
- lastTransitionTime: "2022-07-03T20:59:45Z"
lastUpdateTime: "2022-07-03T21:00:05Z"
message: ReplicaSet "loki-gateway-574bc6f96" has successfully progressed.
reason: NewReplicaSetAvailable
status: "True"
type: Progressing
- lastTransitionTime: "2022-07-03T21:01:26Z"
lastUpdateTime: "2022-07-03T21:01:26Z"
message: Deployment does not have minimum availability.
reason: MinimumReplicasUnavailable
status: "False"
type: Available
observedGeneration: 1
replicas: 1
unavailableReplicas: 1
updatedReplicas: 1
The more I mess with this, the more I'm starting to think this is not a Flux issue...
@alokhom try running this tool; it's definitely finding some discrepancies in owner UID. My issue is not a Flux issue; now to find the cause...
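(One way to inspect such discrepancies manually, assuming the agv Deployment in the graph namespace from earlier in this thread, is to compare the Deployment's UID against the ownerReferences of its ReplicaSets:)
$ kubectl get deployment agv -n graph -o jsonpath='{.metadata.uid}'
$ kubectl get replicasets -n graph -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.metadata.ownerReferences[0].uid}{"\n"}{end}'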