XRC -> XR Label Propagation Confuses ArgoCD
Edit from @negz:
In the case of an XR created by an XRC, ArgoCD sees a XR appear with the application ownership label (currently automatically propagated by Crossplane), but without an owner reference to any known resource
I believe, per @mcavoyk's comment above, that this is the key issue we're seeing here. ArgoCD thinks it (not Crossplane) created the XR, because when Crossplane creates the XR it propagates a label ArgoCD uses to track which resource it owns. This results in the XR either being shown as "out-of-sync" or being deleted depending on whether auto-prune is enabled.
Original issue follows.
Dear all, in the scenario we are testing we deploy on a k8s cluster crossplane and all the resources taked from platform-ref-gcp except the final resource of the example folder. These last resources for example triggering the Cluster creation was done using ArgoCD that formally connect to the k8s cluster and execute the yaml. Flagging the auto-pruning feature of ArgoCD leads to a sort of resource contentions between Crossplane and Argo:
Normal ConfigureCompositeResource 19s (x3 over 20s) offered/compositeresourcedefinition.apiextensions.crossplane.io Successfully applied composite resource Warning ConfigureCompositeResource 19s (x2 over 20s) offered/compositeresourcedefinition.apiextensions.crossplane.io cannot patch object: Operation cannot be fulfilled on compositenetworks.gcp.platformref.crossplane.io "test-network-d6xjh": the object has been modified; please apply your changes to the latest version and try again Normal BindCompositeResource 18s (x3 over 20s) offered/compositeresourcedefinition.apiextensions.crossplane.io Composite resource is not yet ready Warning BindCompositeResource 18s offered/compositeresourcedefinition.apiextensions.crossplane.io compositenetworks.gcp.platformref.crossplane.io "test-network-d6xjh" not found
and the Cluster was not created. Disabling auto-pruning indeed lead to the resource creation, but Argo remain out-of-sync. In my opinion the correct ways is to having all the crd and definiton of gcp-reference done by Argo, not only the last one. So I am not sure if is actually a bug or not, or simple an usage misunderstanding. Thanks, Ivan
Thanks for raising this issue @idallaserra! I'm not personally very familiar with ArgoCD. Can you help me understand exactly what the auto-pruning feature is trying to do? Is it trying to undo changes that Crossplane controllers are making to the kind: Cluster (for example is it trying to remove the spec.compositeRef.name and spec.resourceRef.name fields that Crossplane sets automatically?
From what I can tell, ArgoCD deletes the composed resources that are created by Crossplane in order to keep git repo fully authoritative and in-sync all the time. Though I wonder how it'd handle Pods created by Deployments, that's essentially the same mechanism with Crossplane creating managed resources when it sees the composite resource.
This is very similar to what I've been seeing with Argo - if you have prune enabled on an application, it will remove any resources it doesn't manage directly itself.
It can be avoided by applying argocd.argoproj.io/sync-options: Prune=false as an annotation to each resource that ArgoCD should not manage, but right now it isn't possible to apply that annotation to the Composite resource created from a Claim against an XRD (outlined in #2000). Argo will delete the Composite resource created from a Claim, causing Crossplane to part-create and then delete the underlying resources over and over.
Right now you have to disable prune on the Argo application itself, but with #2000 implemented you could set the annotation on the XRD, Composite and underlying resources so these are all managed by Crossplane, leaving the Claim resource managed by Argo.
If you keep Prune=false then ArgoCD will be always in "out of sync" mode, that is not something that you would like to see in a gitops approach
Thanks folks. It sounds like we should address this by finding a solution to https://github.com/crossplane/crossplane/issues/2000, which should let Argo know that it should manage the claim but not the resulting XR, and thus leave the XR alone. To @muvaf's point I think it's unrealistic to expect ArgoCD to directly manage every resource in the API server. Having ArgoCD model and manage a claim's XR and composed resources would be like having argo model and manage a Deployment's ReplicaSets and Pods.
Though I wonder how it'd handle Pods created by Deployments, that's essentially the same mechanism with Crossplane creating managed resources when it sees the composite resource.
My understanding from using ArgoCD and a quick scan through some of logic is that generated resources are not pruned if they have a known parent resource, e.g. the Pod is not pruned because in its ownership hierarchy is a Deployment that is part of the git repo. There are some exceptions made for Endpoints and for PersistentVolumeClaim's created by StatefulSets.
Creating a XR with ArgoCD doesn't experience this pruning issue, as all the generated resources may end up with the ArgoCD application ownership label (if metadata.labels is patched on the Compositions), but the generated provider resources all have ownership references set to the XR.
In the case of an XR created by an XRC, ArgoCD sees a XR appear with the application ownership label (currently automatically propagated by Crossplane), but without an owner reference to any known resource. Attempting to disable ArgoCD prune via their available annotation is not really effective in this case, as the primary means to apply it to the XR is through adding to the XRC, which means the XRC is unable to be deleted by ArgoCD.
@mcavoyk thank you for detailed explanation!
There are some exceptions made for Endpoints and for PersistentVolumeClaim's created by StatefulSets.
We want to come up with a good solution for this labeling stuff but I see that some exceptions are added for coreos operator group in ArgoCD. Do you think we can add an exception to cover claim <> composite relationship? The reason composite doesn't have an ownerRef is that claim is namespaced and composite is cluster-scoped and apiserver doesn't allow a namespaced resource to own a cluster-scoped one. The exception would look at spec.claimRef to figure out whether it's created by a known claim.
Hi @negz @muvaf ,
Recently i also encounter similar error message mentioned above in my composite resource (pardon me about the unconventional format of the k8s event logs as it is generated by the Alert tool):
Exactly 2 Warning Events generated in the namespace where the Claim Resource residing:
Description: Error Occurred in MyBucket: mynamespace/test-s3 in dev cluster
cannot update composite resource: Operation cannot be fulfilled on abc.com "test-s3-2j97g": the object has been modified; please apply your changes to the latest version and try again
abc.com/v1alpha1/mybuckets error
Description: Error Occurred in MyBucket: mynamespace/test-s3 in dev cluster
cannot patch object: Operation cannot be fulfilled on abc.com "test-s3-2j97g": the object has been modified; please apply your changes to the latest version and try again
I am currently using ArgoCD for Claim Resource deployment and also to overcome the auto-pruning issue, I have also included ArgoCD settings for resource.exclusions [Ref] https://argoproj.github.io/argo-cd/operator-manual/declarative-setup/#resource-exclusioninclusion] and applied it to the Composite and Managed Resource Kind. With this setup in place, ArgoCD wont do the auto-syncing (nor it will delete the object that is not managed by them). This method might help solving the ArgoCD behavior for @benagricola @idallaserra
My suspicion is if there are other Reconciler process that has updated the Composite Resource which causing its resourceversion to be different and became fail while the Claim Reconciler try to update it.
Also fyi, I don't recall having this error message before i upgraded to crossplane v1.0.0 and updated the xrd apiVersion to apiextensions.crossplane.io/v1.
@kelvinwijaya these errors should be safe to skip as long as they don't happen continuously since the claim controller will make the same update in the next reconcile with the updated composite object.
Hi @muvaf
Just to understand a bit on the problem, does it mean there are 2 controllers (composite & claim) that is trying to update the composite object as the same time which caused race condition?
Also are you trying to explain that eventually the claim controller will make sure the composite object to be updated during the binding process?
Thanks
does it mean there are 2 controllers (composite & claim) that is trying to update the composite object as the same time which caused race condition?
Yes, this is possible and intended. This error is thrown when the resource that controller A will update changes between the start of its reconciliation and the end. So, all setups where two controllers make any operation on the same resource is subject to this. Though, reconciliation makes sure the resource is consistent eventually.
The problem is actually:
- Argocd adds an annotation to every resource it creates directly.
- If a resource isn't in the manifest source that argo is syncing from, but does have an annotation, argo assumes that it was installed by argo, but it no longer is desired, so it wants to prune it.
- Crossplane is propogating annotations to resources it creates, including argocd annotations.
- Argo wants to prune everything that crossplane is generating.
I've compiled my findings into #2928
I believe it covers the reason for every issue described here.
Adding the annotation argocd.argoproj.io/compare-options: IgnoreExtraneous to the XRC works. The XR and MRs will show up in the UI, but won't be considered when evaluating the sync status
@James-Quigley I'm not sure how that would work. That tells ArgoCD to ignore the resource. So it will ignore the XRC too if I read docs correctly. When I set it, I see this. I think after re-reading the dpcs this is a setting that tells ArgoCD from the repo resource level, not to consider resources linked via owner ref chain in sync status.

After some number of syncs, it seems to be ok with the XRC, but the XR and beyond show out of sync and desired manifest blank.
If I delete the resource from ArgoCD, and then sync again, it doesn't come into UI. So seems similar to setting ignore at ArgoC project level, inly a lot of additional churn to get there.
Can you add some detail on how you've configured? I still see changing the way Crossplane is propagating labels as a better long-term solution as it doesn't require any additional configuration.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Revive
As of ArgoCD 2.4.8, the new annotation based tracking works with xr and xrc with auto pruning enabled. Application shows as synced and health is reflected correctly. XR does not get auto-pruned.
One caveat is that XRs will show as root resources instead of them being "children" of claims.
@negz I propose closing this issue based on @nabuskey comment.
Hi, @nabuskey did you use any additional configuration/tuning, I have a recent version of ArgoCD 2.4.11 and still struggling with the pruning of XRs.
Sorry, I missed the application.resourceTrackingMethod: annotation, with it, it's working great, thx.
@negz I propose closing this issue based on @nabuskey comment.
I think we could close this when #3041 is merged - that provides guidance on how to configure Argo, application.resourceTrackingMethod: annotation, etc. to work well with Crossplane 🤓
I had to enable clusterResourceWhitelist on the specific group I'm using on my AppProjet so that I can see the Composite Resource in ArgoCD. But if I understand correctly, this would also allow my users to explicitly declare Composite Resources or even Managed Resources without using Claims. Is it true? Is there any workaround?
This doesn't seem like a reasonable fix, shouldn't RD Claims that generate XRD's propagate ownership references, along with claim references. Wouldn't this correct the hierarchy issue in Argo and bring XRD's into scope of the project? At the moment, it seems that an misconfiguration in Argo could wipe out infrastructure using crossplane.
FunctionRevsions for example correctly specify the owner reference and link to the Function
apiVersion: pkg.crossplane.io/v1beta1
kind: FunctionRevision
metadata:
creationTimestamp: '2024-01-12T10:50:59Z'
finalizers:
- revision.pkg.crossplane.io
generation: 1
labels:
pkg.crossplane.io/package: terraform-example
name: terraform-example-714659823149
ownerReferences:
- apiVersion: pkg.crossplane.io/v1beta1
blockOwnerDeletion: true
controller: true
kind: Function
name: terraform-example
uid: c50856f8-9a89-454a-9cfc-103057d92005