argo-cd
ignoreResourceUpdates does not work
Checklist:
- [x] I've searched in the docs and FAQ for my answer: https://bit.ly/argocd-faq.
- [x] I've included steps to reproduce the bug.
- [x] I've pasted the output of argocd version.
Describe the bug
Hello! I have a problem with a reconciliation loop ("Requesting app refresh caused by object update"), due to the fact that some resources are constantly changing.
Following your instructions, I found out which resource causes the constant updates: https://argo-cd.readthedocs.io/en/stable/operator-manual/reconcile/#finding-resources-to-ignore
It is the ConfigMap kops-controller-leader in namespace kube-system; its metadata is constantly changing:
control-plane.alpha.kubernetes.io/leader: >-
  {"holderIdentity":"ip-**-**-**-**_*********************","leaseDurationSeconds":15,"acquireTime":"2023-08-29T09:27:07Z","renewTime":"2023-09-20T12:56:56Z","leaderTransitions":0}
This leads to a refresh of all Argo CD applications approximately every second. I tried adding exceptions to argocd-cm as in the documentation (https://argo-cd.readthedocs.io/en/stable/operator-manual/argocd-cm-yaml/), but it still generates millions of updates per day:
resource.customizations.ignoreDifferences.all: |
  jqPathExpressions:
    - '.metadata.annotations."control-plane.alpha.kubernetes.io/leader"'
    - .metadata.resourceVersion
  managedFieldsManagers:
    - kube-controller-manager
    - external-secrets
  jsonPointers:
    - /spec/replicas
    - /metadata/resourceVersion
    - /metadata/annotations/control-plane.alpha.kubernetes.io~1leader
resource.customizations.ignoreResourceUpdates._ConfigMap: |
  jqPathExpressions:
    - '.metadata.annotations."control-plane.alpha.kubernetes.io/leader"'
    - .metadata.resourceVersion
resource.customizations.ignoreResourceUpdates.all: |
  jqPathExpressions:
    - '.metadata.annotations."control-plane.alpha.kubernetes.io/leader"'
    - .metadata.resourceVersion
  jsonPointers:
    - /status
    - /metadata/resourceVersion
    - /metadata/annotations/control-plane.alpha.kubernetes.io~1leader
resource.ignoreResourceUpdatesEnabled: 'true'
Screenshots
Version
v2.8.3
Same here, it slows down all Argo CD operations. The weird thing is that the ConfigMap is not tracked by Argo CD; it is created by a controller, so I don't understand why Argo CD watches it. Maybe because I activated orphaned resources in projects...
Had the same problem; completely removing the orphanedResources option from the main AppProject helped.
@kollad It worked. Thank you!
I don't think this should be closed, this behavior is still a bug.
@duizabojul Ok I will reopen
Had the same problem; completely removing the orphanedResources option from the main AppProject helped.
@kollad I'm not getting what needs to change exactly. Can you please elaborate on what exactly needs to change?
We have the same situation with an Elasticsearch operator ConfigMap used for leader election. EndpointSlices are also creating a lot of reconciliations.
We tried to ignore the updates, since on the ConfigMap the leader annotation as well as the resourceVersion keep updating:
apiVersion: v1
kind: ConfigMap
metadata:
  annotations:
    control-plane.alpha.kubernetes.io/leader: '{"holderIdentity":"elastic-operator-0_098489e3-3b66-4d7b-b17e-ac1555175d69","leaseDurationSeconds":15,"acquireTime":"2023-12-28T16:08:12Z","renewTime":"2024-01-05T11:53:17Z","leaderTransitions":179}'
  creationTimestamp: "2022-06-10T11:38:35Z"
  name: elastic-operator-leader
  namespace: elastic-operator
  resourceVersion: "815237253"
  uid: c876a2c1-efd0-4902-970c-02d4e2531b81
On the EndpointSlices, the renewTime annotation and also the resourceVersion are constantly changing:
--- /tmp/first.yaml 2024-01-05 13:07:44.109295084 +0100
+++ /tmp/second.yaml 2024-01-05 13:07:54.485291050 +0100
@@ -19,7 +19,7 @@
acquireTime: "2023-06-14T09:09:12.099141+00:00"
leader: some-service-name-0
optime: "4445962240"
- renewTime: "2024-01-05T12:07:36.500017+00:00"
+ renewTime: "2024-01-05T12:07:46.499734+00:00"
transitions: "3"
ttl: "30"
creationTimestamp: "2023-06-14T09:09:13Z"
@@ -42,7 +42,7 @@
kind: Endpoints
name: some-service-name
uid: cc85b6fa-178d-4364-aa52-cb270b8ef44d
- resourceVersion: "815254315"
+ resourceVersion: "815254479"
uid: 56c48541-440c-4f49-9783-8e5c5338e72d
ports:
- name: postgresql
(Side note: the EndpointSlice is probably to be ignored anyway, but see https://github.com/argoproj/gitops-engine/pull/469.)
We tried to ignore these updates with this config:
# SEE:
#   documentation:  https://argo-cd.readthedocs.io/en/release-2.8/operator-manual/reconcile/
#   example config: https://argo-cd.readthedocs.io/en/stable/operator-manual/argocd-cm-yaml/
resource.ignoreResourceUpdatesEnabled: "true"
resource.customizations.ignoreResourceUpdates.all: |
  jsonPointers:
    - /metadata/resourceVersion
resource.customizations.ignoreResourceUpdates.ConfigMap: |
  jqPathExpressions:
    # ElasticOperator is updating this around 2 times per second
    - '.metadata.annotations."control-plane.alpha.kubernetes.io/leader"'
resource.customizations.ignoreResourceUpdates.discovery.k8s.io_EndpointSlice: |
  jsonPointers:
    # EndpointSlices should be ignored completely, as Endpoints already are
    # (see: https://github.com/argoproj/gitops-engine/pull/469); until this is
    # done automatically, ignoring `/metadata/resourceVersion` for all resources
    # plus ignoring this annotation should reduce the amount of updates significantly
    - /metadata/annotations/renewTime
So either our config is not correct, or the feature is not working on these resources. They are both "orphaned" resources, so maybe the feature actually doesn't work on non-managed resources?
@Sathish-rafay I think what @kollad meant is https://argo-cd.readthedocs.io/en/stable/user-guide/orphaned-resources/ - removing the setting altogether helped him, as most updates come from these non-managed resources which update constantly.
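For anyone looking for the concrete setting: this is roughly the block to delete from (or leave disabled in) the AppProject spec. A minimal sketch only; the project name and the rest of the spec are placeholders, the relevant part is the orphanedResources field:
apiVersion: argoproj.io/v1alpha1
kind: AppProject
metadata:
  name: default            # placeholder project name
  namespace: argocd
spec:
  # ... sourceRepos, destinations, etc. ...
  # Removing this block disables orphaned-resource monitoring, so the
  # controller no longer refreshes apps on updates to non-managed
  # resources such as leader-election ConfigMaps:
  orphanedResources:
    warn: false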
So either our config is not correct, or the feature is not working on these resources. They are both "orphaned" resources, so maybe the feature actually doesn't work on non-managed resources?
Just checked: the ConfigMap is an "orphanedResource", but the EndpointSlice supposedly is not. I guess the latter is tracked via the OwnerReference to the Endpoint, which in theory is also not managed by Argo CD, but I bet (though I don't know) that Argo CD knows that the managed Service will create an Endpoint and automatically tracks that as "this is managed".
But independently of whether Argo CD thinks the EndpointSlice is managed or not, the updates aren't ignored (at least that's what we saw in the debug logs on the application controller pod).
Same for me, ignoreResourceUpdates does not work on orphaned resources:
resource.customizations.ignoreResourceUpdates.autoscaling.k8s.io_VerticalPodAutoscalerCheckpoint: |
  jsonPointers:
    - /status
resource.ignoreResourceUpdatesEnabled: 'true'
I still see "Requesting app refresh" after updating the ConfigMap:
{"api-version":"autoscaling.k8s.io/v1","application":"argocd/poc-idp","cluster-name":"anthos-test-nprd","fields.level":1,"kind":"VerticalPodAutoscalerCheckpoint","level":"debug","msg":"Requesting app refresh caused by object update","name":"poc-idp-wordpress","namespace":"poc-idp","server":"https://XXXXX.central-1.amazonaws.com","time":"2024-02-06T17:11:33Z"}
So, I'm curious about this one... I understand how to ignore individual field updates on certain types of objects, but we operate a very fast-moving Kubernetes cluster that launches between 200 and 400k pods daily. When we look at the log entries for "Requesting app refresh caused by object update", we can see that we are getting 25 new pod updates per second.
Is there some way to make ArgoCD ignore Pod/EndpointSlice changes for the purpose of manifest comparison?
We are experiencing the same with resources managed by an operator
Is there some way to make ArgoCD ignore Pod/EndpointSlice changes for the purpose of manifest comparison?
Just out of curiosity (for EndpointSlices): did you try these two things?
- disable orphanedResources on your AppProjects
- ignore EndpointSlices, for example:
resource.customizations.ignoreResourceUpdates.discovery.k8s.io_EndpointSlice: |
  jsonPointers:
    - /metadata/annotations/renewTime
    - /metadata/resourceVersion
I am not sure if this will help with ignoring things newly created by an HPA, but it would be interesting to see if it reduces the updates somehow.
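Another, blunter option for the quoted Pod/EndpointSlice question is the cluster-wide resource.exclusions setting in argocd-cm, which stops Argo CD from watching those kinds at all; the trade-off is that excluded kinds also disappear from the resource tree in the UI. A sketch of what that could look like (the kind list is just an example, adjust to your needs):
resource.exclusions: |
  - apiGroups:
      - ""                  # core API group (Pods)
    kinds:
      - Pod
    clusters:
      - "*"
  - apiGroups:
      - discovery.k8s.io
    kinds:
      - EndpointSlice
    clusters:
      - "*"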
We have the same issue. The Argo Events app is creating the following ConfigMap:
apiVersion: v1
kind: ConfigMap
metadata:
  annotations:
    control-plane.alpha.kubernetes.io/leader: '{"holderIdentity":"controller-manager-c8d4c76d-f6x4w_3c1ec9c1-9b77-43bb-a943-5b26247a33b6","leaseDurationSeconds":15,"acquireTime":"2024-03-03T19:44:57Z","renewTime":"2024-03-03T20:34:57Z","leaderTransitions":1}'
  creationTimestamp: "2024-03-03T19:44:38Z"
  name: argo-events-controller
  namespace: argo-events
  resourceVersion: "88954276"
  uid: 6a4dfeca-a51d-406a-a26b-486b3539313e
The renewTime in metadata.annotations."control-plane.alpha.kubernetes.io/leader" and the resourceVersion change every few seconds.
As long as we have the following AppProject config, we have a reconciliation loop of Argo Events every few seconds:
orphanedResources:
  warn: false
This leads to higher CPU usage of the Argocd-Application-Controller.
We also tried "resource.ignoreResourceUpdates" inside the argocd-cm without any success (https://github.com/argoproj/argo-cd/issues/15594#issuecomment-1878577773).
Below is the debug log, which shows that the reconciliation is triggered by the ConfigMap "argo-events-controller" from namespace argo-events:
argocd-application-controller-0 argocd-application-controller time="2024-03-03T20:40:33Z" level=debug msg="Checking if cluster https://kubernetes.default.svc with clusterShard 0 should be processed by shard 0"
argocd-application-controller-0 argocd-application-controller time="2024-03-03T20:40:33Z" level=debug msg="Requesting app refresh caused by object update" api-version=v1 application=argocd/argo-events cluster-name= fields.level=1 kind=ConfigMap name=argo-events-controller namespace=argo-events server="https://kubernetes.default.svc"
argocd-application-controller-0 argocd-application-controller time="2024-03-03T20:40:33Z" level=info msg="Refreshing app status (controller refresh requested), level (1)" application=argocd/argo-events
argocd-application-controller-0 argocd-application-controller time="2024-03-03T20:40:33Z" level=info msg="Comparing app state (cluster: https://kubernetes.default.svc, namespace: argo-events)" application=argocd/argo-events
argocd-application-controller-0 argocd-application-controller time="2024-03-03T20:40:33Z" level=info msg="No status changes. Skipping patch" application=argocd/argo-events
argocd-application-controller-0 argocd-application-controller time="2024-03-03T20:40:33Z" level=info msg="Reconciliation completed" application=argocd/argo-events dedup_ms=0 dest-name= dest-namespace=argo-events dest-server="https://kubernetes.default.svc" diff_ms=1 fields.level=1 git_ms=20 health_ms=1 live_ms=1 patch_ms=0 setop_ms=0 settings_ms=0 sync_ms=0 time_ms=43
Removing the orphanedResources setting inside the AppProject is a workaround, but I am surprised that orphaned resources trigger a reconciliation at all. It looks like a bug to me.
I ended up completely excluding VPA with resource.exclusions.
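For reference, a sketch of what such an exclusion could look like in argocd-cm (the exact group/kind list is an assumption, not taken from the comment above):
resource.exclusions: |
  - apiGroups:
      - autoscaling.k8s.io
    kinds:
      - VerticalPodAutoscaler
      - VerticalPodAutoscalerCheckpoint
    clusters:
      - "*"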
Same problem for me with Argo CD and Istio, which maintains several ConfigMaps with control-plane.alpha.kubernetes.io/leader settings.
The ignoreResourceUpdates definition from https://github.com/argoproj/argo-cd/issues/15594#issue-1905156183 worked for me only after removing all orphanedResources settings from my projects and restarting the application controller.
So why is orphanedResources interfering with these ignoreResourceUpdates definitions?
So this is really interesting to me: this issue is really old and still happening. We just noticed that even though we have the /status field ignored on all of our resources, every few seconds a HorizontalPodAutoscaler object update still triggers a reconciliation:
resource.customizations.ignoreResourceUpdates.all: |
  jsonPointers:
    - /status
time="2024-06-14T15:42:35Z" level=debug msg="Requesting app refresh caused by object update" api-version=autoscaling/v2 application=argocd-system/... cluster-name= fields.level=0 kind=HorizontalPodAutoscaler name=.... namespace=otel server="https://kubernetes.default.svc"
Using kubectl-grep we watch the HPA object, and the diffs are all in the supposedly ignored fields:
apiVersion: "autoscaling/v2"
kind: "HorizontalPodAutoscaler"
metadata:
creationTimestamp: "2024-06-03T02:55:49Z"
name: "otel-collector-metrics-processor-collector"
namespace: "otel"
ownerReferences:
-
apiVersion: "opentelemetry.io/v1beta1"
blockOwnerDeletion: true
controller: true
kind: "OpenTelemetryCollector"
name: "...."
uid: "f131b749-c70a-4fc9-a4e2-21aea2023410"
- resourceVersion: "221899671"
+ resourceVersion: "221900017"
uid: "a5432460-837e-4a89-85dd-1177034cf993"
spec:
...
status:
conditions:
-
lastTransitionTime: "2024-06-03T02:56:04Z"
message: "recommended size matches current size"
reason: "ReadyForNewScale"
status: "True"
type: "AbleToScale"
-
lastTransitionTime: "2024-06-13T02:17:45Z"
message: "the desired replica count is less than the minimum replica count"
reason: "TooFewReplicas"
status: "True"
type: "ScalingLimited"
-
lastTransitionTime: "2024-06-11T08:44:44Z"
- message: "the HPA was able to successfully calculate a replica count from memory resource utilization (percentage of request)"
+ message: "the HPA was able to successfully calculate a replica count from cpu resource utilization (percentage of request)"
reason: "ValidMetricFound"
status: "True"
type: "ScalingActive"
currentMetrics:
-
resource:
current:
- averageUtilization: 18
+ averageUtilization: 21
- averageValue: "376m"
+ averageValue: "425m"
name: "cpu"
type: "Resource"
-
resource:
current:
- averageUtilization: 12
+ averageUtilization: 11
- averageValue: "394474837333m"
+ averageValue: "367874048"
name: "memory"
type: "Resource"
currentReplicas: 3
desiredReplicas: 3
lastScaleTime: "2024-06-09T23:33:28Z"
The above update should not be triggering a reconciliation, because it only updates the /status and /metadata/resourceVersion fields. Our configuration explicitly ignores /status, and according to the docs the other field should be ignored too: "By default, the metadata fields generation, resourceVersion and managedFields are always ignored for all resources."
Following up on this: we see the same update behavior for all DaemonSets... any time a new pod is started, the /status field is updated... these should be ignored, but they aren't, and it triggers a refresh of the app.
Looking at the code and from some experiments, it seems that this configuration only works for objects that are directly managed by Argo CD (applied to the cluster from the manifest). It doesn't work for objects that are in the resource tree but not directly tracked by Argo CD.
One alternate thought I've had while trying to debug why some of our Helm apps get stuck in this issue: we could add some sort of argocd.argoproj.io/skip-reconcile-time: '300' annotation. In theory this would be a number set on each application, and if the last refresh was within this time, the refresh is simply skipped.
I.e. argocd.argoproj.io/skip-reconcile-time: '300' would result in only one refresh per five minutes, no matter what.
I suppose the only exception we may want is that manual refreshes will always run.
This would be better than simply marking an application as skipped entirely, as it would at least keep some sort of status/progression while not hamstringing the application server.
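To illustrate the proposal (to be clear, this annotation does not exist in Argo CD today; it is purely hypothetical), it would be set per Application, something like:
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: my-app               # hypothetical application
  namespace: argocd
  annotations:
    # Proposed, non-existent knob: after a refresh, ignore refresh requests
    # caused by watched-object updates for the next 300 seconds;
    # manual refreshes would still go through.
    argocd.argoproj.io/skip-reconcile-time: "300"
spec:
  # ... source, destination, project ...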
@diranged this is happening to me also. I have a few fields ignored, but the resources keep triggering refreshes. For instance:
resource.customizations.ignoreResourceUpdates.keda.sh_ScaledObject: |
  jsonPointers:
    - /metadata/resourceVersion
    - /spec/triggers
    - /spec/cooldownPeriod
    - /spec/pollingInterval
    - /status/lastActiveTime
Not sure what to do next. I am using version 2.10.3.