argo-cd
sync is not working with ignoreDifferences and RespectIgnoreDifferences=true
Checklist:
- [x] I've searched in the docs and FAQ for my answer: https://bit.ly/argocd-faq.
- [x] I've included steps to reproduce the bug.
- [x] I've pasted the output of `argocd version`.
Describe the bug
I have an application where I want to ignore the image of the Deployment so that it is not synced. The image is ignored as expected, and the app diff shows new env variables to be added. But when I sync, the sync succeeds without applying the env variables, as if they were ignored along with the image, and I still see them in the app diff. This is the application:
project: default
source:
  repoURL: '[email protected]:x/x.git'
  path: kubernetes/production
  targetRevision: master
  plugin:
    name: argocd-vault-plugin
destination:
  server: 'https://x.x.x.x'
syncPolicy:
  syncOptions:
    - RespectIgnoreDifferences=true
ignoreDifferences:
  - group: apps
    kind: Deployment
    name: service
    jqPathExpressions:
      - >-
        .spec.template.spec.containers[] | select(.name == "service").image
To Reproduce
Apply an application with ignoreDifferences like the one above, then add new changes inside the same container spec. The new changes will not be applied after sync.
Expected behavior
I expect the image to be ignored and the new env variables to be applied/synced.
Version
❯ argocd version --grpc-web
argocd: v2.2.7+b8e154f
  BuildDate: 2022-03-09T01:07:22Z
  GitCommit: b8e154f767412172bb0c2b41131460c34d2a0c73
  GitTreeState: clean
  GoVersion: go1.16.11
  Compiler: gc
  Platform: linux/amd64
argocd-server: v2.3.4+ac8b7df
  BuildDate: 2022-05-18T11:41:37Z
  GitCommit: ac8b7df9467ffcc0920b826c62c4b603a7bfed24
  GitTreeState: clean
  GoVersion: go1.17.10
  Compiler: gc
  Platform: linux/amd64
  Ksonnet Version: v0.13.1
  Kustomize Version: v4.4.1 2021-11-11T23:36:27Z
  Helm Version: v3.8.0+gd141386
  Kubectl Version: v0.23.1
  Jsonnet Version: v0.18.0
I am also experiencing this issue
We are also seeing the issue. We have two ignore differences on Deployments: one for `spec.replicas` using `jsonPointers`, as these can be changed by HPA and other external processes, and one for the image `.spec.template.spec.containers[].image` using a `jqPathExpression` due to the array of containers (see below).
Through experimentation I've found that if you change `.spec.template.spec.containers[0].resources.limits.memory` to a different value, then `OutOfSync` appears, yet when you click `Sync` with `RespectIgnoreDifferences=true` the Sync task sees nothing to change and marks itself as complete, but the App still shows `OutOfSync`. When auto sync is enabled this causes infinite syncs.
I then revert the change in the repo.
If I change `.spec.template.metadata.labels` by adding a new label, then `OutOfSync` appears, and when I press `Sync` with `RespectIgnoreDifferences=true` the Sync applies the change and maintains the image.
So my assumption is that using an array, potentially only a 'wildcard' array, within JQ breaks the Sync logic:
ignoreDifferences:
  - group: apps
    jsonPointers:
      - /spec/replicas
    kind: Deployment
  - group: apps
    jqPathExpressions:
      - .spec.template.spec.containers[].image
    kind: Deployment
Did a further experiment and found that it doesn't appear to be just the JQ expression but in fact anything with an array.
Knowing we only tend to have single-container Pods, I changed the `jqPathExpressions` to `jsonPointers`, using `/0` in place of the `[]`, and found an incomplete sync when changing the memory limit as above:
ignoreDifferences:
  - group: apps
    jsonPointers:
      - /spec/replicas
    kind: Deployment
  - group: apps
    jsonPointers:
      - /spec/template/spec/containers/0/image
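For readers comparing the two path styles used above, here is a minimal Python sketch of how a JSON pointer such as `/spec/template/spec/containers/0/image` addresses exactly one array index, while the jq path with `[]` covers every container. The `resolve_json_pointer` helper and the sample Deployment fragment are illustrative assumptions, not Argo CD code.

```python
def resolve_json_pointer(document, pointer):
    """Resolve an RFC 6901 JSON pointer against nested dicts and lists."""
    if pointer == "":
        return document
    if not pointer.startswith("/"):
        raise ValueError("a non-empty JSON pointer must start with '/'")
    current = document
    for raw_token in pointer[1:].split("/"):
        # RFC 6901 escapes: "~1" stands for "/", "~0" stands for "~".
        token = raw_token.replace("~1", "/").replace("~0", "~")
        if isinstance(current, list):
            current = current[int(token)]  # list items are addressed by index
        else:
            current = current[token]
    return current


# Hypothetical Deployment fragment; the names and image tags are made up.
deployment = {
    "spec": {
        "replicas": 3,
        "template": {
            "spec": {
                "containers": [
                    {"name": "service", "image": "example/service:1.0"},
                    {"name": "sidecar", "image": "example/sidecar:2.0"},
                ]
            }
        },
    }
}

# The pointer selects only the first container's image.
first_image = resolve_json_pointer(deployment, "/spec/template/spec/containers/0/image")
# Rough equivalent of the jq path `.spec.template.spec.containers[].image`.
all_images = [c["image"] for c in deployment["spec"]["template"]["spec"]["containers"]]

print(first_image)
print(all_images)
```

The pointer form only matches what the jq form matches when the Pod has a single container, which is the situation described in the comment above.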
Also facing the same issue. Here is a simple PoC to reproduce the error:
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: ingress-nginx-testing
  namespace: argocd
  finalizers:
    - resources-finalizer.argocd.argoproj.io
spec:
  project: default
  source:
    repoURL: 'https://charts.bitnami.com/bitnami'
    targetRevision: 13.2.13
    chart: nginx
    helm:
      version: v3
      values: |
        image:
          tag: "1.23.2-debian-11-r2"
  destination:
    namespace: default
    server: https://kubernetes.default.svc
  syncPolicy:
    syncOptions:
      - ApplyOutOfSyncOnly=true
      - RespectIgnoreDifferences=true
  ignoreDifferences:
    - group: apps
      kind: Deployment
      jqPathExpressions:
        - .spec.template.spec.containers[] | select(.name=="nginx") | .env[] | select(.name=="BITNAMI_DEBUG") | .value
If you change the image tag from `1.23.2-debian-11-r2` to `1.23.2-debian-11-r1`, the app will be shown as out-of-sync. If you sync now, the pod will not be replaced, and after the sync the diff is still there.
This occurs on Argo 2.4.8 and 2.4.14, also on the newest version 2.5.2.
Same behavior on arrays if `managedFieldsManagers` is used. ArgoCD 2.4.14
syncPolicy:
  syncOptions:
    - RespectIgnoreDifferences=true
ignoreDifferences:
  - group: apps
    kind: Deployment
    managedFieldsManagers:
      - openshift-controller-manager
OpenShift allows triggering image stream updates directly on the Deployment, which updates the container image. It was easier to use the manager name, so the difference is only ignored if the trigger is used in our apps.
Confirmed this is also not working with 2.5.2. We have an ignore like this:
ignoreDifferences:
  - group: certmanager.io
    kind: Certificate
    jsonPointers:
      - /spec/dnsNames/2
We see the ignore works correctly, but the 3rd DNS name is removed when syncing with `RespectIgnoreDifferences`. We also tried `/spec/dnsNames` in case there is an oddity with arrays and had the same behavior. This is the live state, including the item being removed:
spec:
  dnsNames:
    - in git
    - also in git
    - manually added and needs to be ignored
And the desired state:
spec:
  dnsNames:
    - in git
    - also in git
any updates on this issue? got the same problem :/
I'm facing the same issue with the list of environment variables for a container. Any updates on this issue? Or even a work around? 🙁
We are using an operator that changes the env vars of the pod in the deployment. Because of this issue, Argo now rolls back the operator's changes to its defaults on every sync (and then the operator changes them again). So we get a restart of the deployment on every change of the application, even when the change is not related to the mentioned deployments at all.
@Len4i did you forget to configure `ignoreDifferences` in your apps so the operator changes are at least ignored with `RespectIgnoreDifferences=true`?
@QuingKhaos, I'll elaborate.
We have a deployment applied with Argo, and `ignoreDifferences` is configured on the app:
ignoreDifferences:
  - group: apps
    kind: Deployment
    name: some-deployment
    jqPathExpressions:
      - '.spec.template.spec.containers[] | select(.name == "some-container").env[] | select(.name == "_JAVA_OPTIONS")'
At this stage the behavior is the following:
On every change in the app, Argo CD does a "full" sync, overriding the env var with what it sees in git. When that happens, the operator sees the change and applies its patch to the env var. Argo CD, following `ignoreDifferences`, does nothing about this change until something else in the app changes.
After adding `RespectIgnoreDifferences=true` the behavior is the following:
Argo CD does not trigger a sync at all when the change is related to anything inside `.spec.template.spec.containers[]`.
The workaround of syncing from the UI with the checkmark (V) removed from `RespectIgnoreDifferences` works, but it is "not the GitOps that we want" :)
> the workaround of syncing from UI with removing V from `RespectIgnoreDifferences` is working, but it is "not the GitOps that we want" :)

Unfortunately, this is currently the only viable workaround to avoid ending up in a sync loop.
> Same behavior on arrays if `managedFieldsManagers` is used. ArgoCD 2.4.14
>
> syncPolicy:
>   syncOptions:
>     - RespectIgnoreDifferences=true
> ignoreDifferences:
>   - group: apps
>     kind: Deployment
>     managedFieldsManagers:
>       - openshift-controller-manager
>
> OpenShift allows triggering image stream updates directly on the Deployment and updates the container image. Was easier to use the manager name, so the difference is only ignored, if the trigger is used in our apps.
In my use case, if we update some value in `.spec.template.spec.containers[]`, like `.resources.limits.cpu`, the diff for the change is not applied. But if we remove `.resources.limits.cpu`, then the diff for the removal is applied correctly.
Problem still exists in ArgoCD v2.4.40
What are the chances that this issue is related?
I just saw that Argo uses `gojq`, and there was a bug in its compiler related to arrays that was fixed in 0.12.11, while Argo 2.5 uses 0.12.9 at the latest.
same issue on latest version
same here with the latest version :(
I have `RespectIgnoreDifferences=true` and the following `ignoreDifferences`:
ignoreDifferences:
  - group: apps
    kind: Deployment
    jqPathExpressions:
      - .spec.template.spec.containers[] | select(.name == "container1").resources
With this configuration, if I add a new container to my deployment, the application stays OutOfSync and never updates the Deployment with the new container.
The same thing occurs with `jsonPointers`:
jsonPointers:
  - /spec/template/spec/containers/0/resources
Something interesting occurs with `managedFieldsManagers`:
managedFieldsManagers:
  - myManager
With this `managedFieldsManagers` configuration, on a fresh deployment, if I add a container to my spec, it is added and synced correctly. If I then make a change in `spec.template.spec.containers` using my custom manager and afterwards try to add another container, the problem occurs and the app stays OutOfSync.
(If I make a change in a field other than `spec.template.spec.containers`, there is no problem.)
Syncing from the UI does not work until I disable the `RespectIgnoreDifferences` option.
Is there any update on this? Having this issue on the latest version
{
    "Version": "v2.7.3+e7891b8.dirty",
    "BuildDate": "2023-05-24T15:05:34Z",
    "GitCommit": "e7891b899a35dca06ae94965ea5ae2a86b344848",
    "GitTreeState": "dirty",
    "GoVersion": "go1.19.6",
    "Compiler": "gc",
    "Platform": "linux/amd64",
    "KustomizeVersion": "v5.0.1 2023-03-14T01:32:48Z",
    "HelmVersion": "v3.11.2+g912ebc1",
    "KubectlVersion": "v0.24.2",
    "JsonnetVersion": "v0.19.1"
}
I had a similar issue where I couldn't sync because of a difference in an immutable field which I wanted to ignore. However, setting `ServerSideApply=true` fixed the issue for me:
syncPolicy:
  syncOptions:
    - ServerSideApply=true
    - RespectIgnoreDifferences=true
ignoreDifferences:
  - group: '*'
    kind: PersistentVolumeClaim
    jqPathExpressions:
      - .spec.volumeName
Same here with 2.7.4:
ignoreDifferences:
  - group: admissionregistration.k8s.io
    kind: MutatingWebhookConfiguration
    name: aws-load-balancer-webhook
    namespace: aws-lb-controller
    jqPathExpressions:
      - '.webhooks[]?.clientConfig.caBundle'
Still showing differences on the caBundle.
I see the gojq library in question was recently updated in main but doesn't appear to have made it into the 2.7.4 release. So hopefully it will go into the 2.7.5 release.
Same here with 2.7.6:
ignoreDifferences:
  - group: batch
    kind: CronJob
    jqPathExpressions:
      - .spec.jobTemplate.spec.template.spec.containers[] | select(.image|test(":0.19.*")).image
syncPolicy:
  automated:
  syncOptions:
    - RespectIgnoreDifferences=true
    - ApplyOutOfSyncOnly=true
Upon changing container command arguments, the ArgoCD sync succeeds with the resource unchanged.
We are using argoproj-labs/argocd-image-updater. That particular jq expression ignores our double-tagged images in our registry (we push both a minor 'latest' tag and a patch tag upon deployment, similar to how `python:3.11` is equivalent to `python:3.11.4` as of writing this).

> the workaround of syncing from UI with removing V from `RespectIgnoreDifferences` is working, but it is "not the GitOps that we want" :)

This indeed works, and the resource is updated, but this is not GitOps. 🫤
There may be more than one problem, but I think most of them can be traced back to this function: https://github.com/argoproj/argo-cd/blob/ef8dae885d988bad9033276cd27ccc28945f3433/controller/sync.go#L446
Imagine I want to modify a CronJob to add a new container.
To construct the patch, Argo CD takes the git state (`templateMap`) and the live state (`valueMap`) and merges them, using this `intersectMap` function.
When `intersectMap` encounters an array, it loops over the array from git (say, an array with two containers) and merges each item with the item at the same index in the array from the live object (say, an array with only one container). If the array in the live object is shorter than the array in the git object, `intersectMap` just drops any remaining items from the git state. So the patch contains only the first container.
That's the answer for the specific case of "adding a new container." The general answer is: this patch function is custom and is not Kubernetes-aware. A Kubernetes-aware patch function would have merged the array on a merge key (like container name) if present.
That's the boring explanation of "what seems to be wrong." In order to fix everyone's use cases, I need details about what is failing:
- live state of the resource, in yaml
- desired state of the resource, in yaml
- ignoreDifferences rules
We have those for a few use cases above. Those can inform the unit tests we write for the fix. But the more cases we can cover, the better.
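The array behavior described above can be illustrated with a small Python sketch. This is not the Go code from `controller/sync.go`; `intersect_map` is a simplified re-creation of the positional merge as described in this comment, `merge_on_key` is a hypothetical key-aware alternative, and the container names and images are made up.

```python
def intersect_map(template, value):
    """Positional merge: keep the parts of `value` that also exist in `template`.

    Lists are paired item by item using their index, with no notion of a
    merge key such as a container name.
    """
    result = {}
    for key, template_item in template.items():
        if key not in value:
            continue
        value_item = value[key]
        if isinstance(template_item, dict) and isinstance(value_item, dict):
            result[key] = intersect_map(template_item, value_item)
        elif isinstance(template_item, list) and isinstance(value_item, list):
            merged = []
            for index, entry in enumerate(template_item):
                if index >= len(value_item):
                    break  # template entries beyond the shorter list are dropped
                other = value_item[index]
                if isinstance(entry, dict) and isinstance(other, dict):
                    merged.append(intersect_map(entry, other))
                else:
                    merged.append(other)
            result[key] = merged
        else:
            result[key] = value_item
    return result


def merge_on_key(template_items, value_items, key="name"):
    """Key-aware alternative: pair entries by `key`, keep unmatched template entries."""
    by_key = {item[key]: item for item in value_items}
    return [{**item, **by_key.get(item[key], {})} for item in template_items]


# Hypothetical states: git declares two containers, the cluster runs only one.
git_state = {"spec": {"containers": [
    {"name": "app", "image": "example/app:2"},
    {"name": "sidecar", "image": "example/sidecar:1"},
]}}
live_state = {"spec": {"containers": [
    {"name": "app", "image": "example/app:1"},
]}}

positional = intersect_map(git_state, live_state)["spec"]["containers"]
keyed = merge_on_key(git_state["spec"]["containers"], live_state["spec"]["containers"])

print([c["name"] for c in positional])  # ['app']
print([c["name"] for c in keyed])       # ['app', 'sidecar']
```

In the positional version the `sidecar` container from git never reaches the result, which matches the "adding a new container" symptom reported earlier in the thread. Merging on a key such as the container name keeps it.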
Closed https://github.com/argoproj/argo-cd/issues/8970 as a duplicate of this. Note that it has a failing use case which doesn't seem to involve arrays.
Also seeing this bug on `v2.7.6`. In our case, we're trying to follow the recommendation by https://github.com/argoproj/argo-cd/issues/2367 -- app config looks as follows:
...
syncPolicy:
  automated:
    selfHeal: true
  syncOptions:
    - Validate=true
    - CreateNamespace=false
    - PrunePropagationPolicy=foreground
    - PruneLast=true
    - RespectIgnoreDifferences=true
ignoreDifferences:
  - kind: Secret
    name: argocd-secret
    namespace: argocd
    jsonPointers:
      - /data/admin
      - /data/server
      - /data/tls
Pasting my comment from #8970
I have the same issue. In my case it is a CRITICAL issue because ArgoCD is wiping out OpenShift's annotations, which leads to the UID and GID changing for the PVCs that PostgreSQL instances are running on. As a result, after PostgreSQL is restarted (e.g. for an upgrade), a production database cannot come back up.
I do not see the changes in the diff - those are indeed ignored, but after sync... the annotation values are regenerated by OpenShift, PVC has old UID & GID but the Pod is starting with new UID & GID...
To mitigate the issue I am going to implement a Kyverno policy to block the update of OpenShift annotations by ArgoCD, but this would be only a workaround.
Using ArgoCD version: v2.5.5
Background
In a multi-tenant environment I prepare environments for clients in bulk: namespaces, resource quotas, various policies, etc. The `kind: Namespace` objects are created by ArgoCD, then OpenShift applies its annotations on every new namespace.
The issue is running sync on an existing namespace: it causes the existing OpenShift annotations to be erased and rewritten with a different UID and GID, which changes the Pod's UID & GID and makes new Pods incompatible with existing PVCs.
I have a similar issue on Argo CD `2.6.4`. To repro, here's a spec using the Helm chart from aws-load-balancer-controller. This part initially works, but when bumping the Helm `targetRevision` from `1.4.8` -> `1.5.3`, the Application does not sync.
spec:
  destination:
    server: 'https://some-server.eks.amazonaws.com/'
    namespace: kube-system
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - RespectIgnoreDifferences=true
  ignoreDifferences:
    - group: admissionregistration.k8s.io
      kind: MutatingWebhookConfiguration
      jqPathExpressions:
        - '.webhooks[]?.clientConfig.caBundle'
    - group: admissionregistration.k8s.io
      kind: ValidatingWebhookConfiguration
      jqPathExpressions:
        - '.webhooks[]?.clientConfig.caBundle'
    - kind: Secret
      name: aws-load-balancer-tls
      jsonPointers:
        - /data/ca.crt
        - /data/tls.crt
        - /data/tls.key
  sources:
    - repoURL: 'https://github.com/aws/eks-charts/eks'
      targetRevision: 1.4.8
      helm:
        releaseName: aws-lb-controller
        values: |
          clusterName: some-cluster-name
          serviceAccount:
            create: true
            name: aws-load-balancer-controller
            annotations:
              eks.amazonaws.com/role-arn: arn:aws:iam::xxxxxxxxxxxx:role/some-role-name
      chart: aws-load-balancer-controller
The main difference between the two versions is that ALBC introduces some changes in its `ValidatingWebhookConfiguration` and `MutatingWebhookConfiguration`, and instead of syncing cleanly, the app is constantly `OutOfSync`.
I added a draft PR (see #14602), there are likely hidden dragons (I can't believe that the fix is that simple 😛). Feel free to add test cases which I can add to the unit tests.
I seem to remember that there were more dragons in there to do with string choice and nested lists etc which is why I ran out of investigation time and haven't been able to come back to it. I've pulled your branch and created one on my own fork. Not really sure how this works so will just leave it in my fork.
I've added a test case to simulate some of what we were seeing when we tried using it (bit rusty on this cause we simply didn't move to ArgoCD yet cause of this).
If I remember rightly live is what is in the cluster and target is what we expect.
Made two Deployment manifests, live and target.
Changed the following in the target:
- `metadata.labels`: added `appProcess: web` label
- `spec.replicas`: changed to 2
- `spec.template.metadata.labels`: added `appProcess: web` label
- `spec.template.spec.containers[0].image`: changed the image tag
- `spec.template.spec.containers[0].resources`: changed from empty object to adding `requests.cpu: 400m`
- `spec.template.spec.containers[0]`: added `env: [{"name": "EV", "value": "here"}]`
Expected the `spec.replicas` and `spec.template.spec.containers[0].image` to stay the same as live and the rest to be the value in target.
However `spec.template.spec.containers[0].resources` remained an empty object and `spec.template.spec.containers[0].env` was not added.
You can see my test at this commit and I'm happy for you to merge it into your branch. Apologies I don't have the time to go further into this but I hope this test case helps.
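The expectation in this test case can be written down as a short Python sketch: start from the target, then copy the live value back for each ignored path. This is only an illustration of the expected outcome, not the test from the linked commit and not Argo CD's implementation. The helpers `get_path`, `set_path`, and `expected_sync_result`, the tuple-based paths, and the image names are all assumptions made for the example.

```python
import copy


def get_path(obj, path):
    """Follow a tuple of dict keys and list indexes into a nested structure."""
    for part in path:
        obj = obj[part]
    return obj


def set_path(obj, path, value):
    """Set the value at `path`; the parent of the last element must already exist."""
    for part in path[:-1]:
        obj = obj[part]
    obj[path[-1]] = value


def expected_sync_result(live, target, ignored_paths):
    """Take everything from target, except ignored paths, which keep their live value."""
    result = copy.deepcopy(target)
    for path in ignored_paths:
        try:
            live_value = get_path(live, path)
        except (KeyError, IndexError):
            continue  # nothing in live to preserve for this path
        set_path(result, path, copy.deepcopy(live_value))
    return result


# Hypothetical manifests reduced to the fields changed in the test case.
live = {
    "metadata": {"labels": {"app": "web"}},
    "spec": {
        "replicas": 1,
        "template": {
            "metadata": {"labels": {"app": "web"}},
            "spec": {"containers": [
                {"name": "web", "image": "example/web:1.0", "resources": {}},
            ]},
        },
    },
}
target = {
    "metadata": {"labels": {"app": "web", "appProcess": "web"}},
    "spec": {
        "replicas": 2,
        "template": {
            "metadata": {"labels": {"app": "web", "appProcess": "web"}},
            "spec": {"containers": [{
                "name": "web",
                "image": "example/web:1.1",
                "resources": {"requests": {"cpu": "400m"}},
                "env": [{"name": "EV", "value": "here"}],
            }]},
        },
    },
}
ignored = [
    ("spec", "replicas"),
    ("spec", "template", "spec", "containers", 0, "image"),
]

result = expected_sync_result(live, target, ignored)
container = result["spec"]["template"]["spec"]["containers"][0]
print(result["spec"]["replicas"], container["image"])
print(container["resources"], container["env"])
```

Under these assumptions the replicas and image keep their live values, while `resources`, `env`, and the new labels come from the target. That is the outcome the test expected but did not observe.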
@si-c613 this is great, thank you!