argo-cd
ArgoCD OutOfSync: sync loop due to 0.5CPU vs 500mCPU
Describe the bug
ArgoCD detects configuration differences and keeps attempting to sync when the limit and request are defined in mCPU units.
ArgoCD won't realize that 0.5 CPU is actually the same as 500 mCPU.
To Reproduce
- Deploy a pod with a 500m CPU resource limit or request value
- Sync and observe the sync loop
- Go to the diff tab of the deployment and observe the difference.
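The reproduction steps above could be exercised with a manifest along these lines. This is only a minimal sketch: the Deployment name, labels, and image are hypothetical, and which of the two spellings ends up in Git versus in the live object is an assumption, since the report does not state it. Both values denote the same quantity of half a CPU core.

```yaml
# Hypothetical Deployment for illustration only.
# The name, labels, and image are assumptions, not taken from the report.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cpu-units-demo
spec:
  replicas: 1
  selector:
    matchLabels:
      app: cpu-units-demo
  template:
    metadata:
      labels:
        app: cpu-units-demo
    spec:
      containers:
        - name: app
          image: nginx:1.25
          resources:
            requests:
              cpu: "0.5"   # decimal form: half a CPU core
            limits:
              cpu: 500m    # millicore form: 500m = 0.5 CPU
```

If the diff tab shows one spelling on the desired side and the other on the live side for the same field, the resource is reported OutOfSync even though the quantities are equal.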
Expected behavior
ArgoCD should recognize that 500 mCPU and 0.5 CPU are the same value.
Screenshots
Additional context
- Observed with v2.10.4+f5d63a5
- ArgoCD gets stuck in a sync loop, since it thinks that the deployed manifest is not in sync and keeps on re-applying the manifest. [NOK]
This is mentioned in the docs, here, with a workaround: https://argo-cd.readthedocs.io/en/stable/user-guide/diffing/#known-kubernetes-types-in-crds-resource-limits-volume-mounts-etc
Thank you very much for mentioning the workaround. I had figured out on my own, without the docs, that this could help, and I have already implemented the change. Still, I am fairly confident that this behavior is not intended, and it would be nice to have a proper fix that prevents the endless sync loop in such cases.
I can confirm; the workaround described in the docs works well. The loop is gone and I am not experiencing any issues after implementing this change, but I hope that it is actually possible to add proper support for CPU requests and limits using mCPU units.
So are you deploying a Pod on its own, and not through one of the abstractions (Deployment, StatefulSet, Job, etc.)?
The Pod was deployed as part of a Kubernetes Deployment.
So the OutOfSync state is actually on your Deployment resource, not on the Pod, right?
Duplicate and/or similar: https://github.com/argoproj/argo-cd/issues/16400
So the OutOfSync state is actually on your Deployment resource, not on the Pod, right?
Yes. The OutOfSync state was on my Deployment resource
I think I am experiencing the same issue after adding a patches section to my Kustomization file. The CPU request is not added by the patch, and if I comment out the patches section, my Deployment is no longer OutOfSync.
ArgoCD version: v2.9.10
@4TT1L4 Can you share what your knownTypeFields config looks like? The example in the docs is for an Argo Rollout, but I can't seem to get the Deployment spec to stop showing OutOfSync.
knownTypeFields:
  # example in docs
  argoproj.io_Rollout: |
    - field: spec.template.spec
      type: core/v1/PodSpec
  # doesn't work
  k8s.io_Deployment: |
    - field: spec.template.spec
      type: core/v1/PodSpec
@clayvan @4TT1L4 Were you able to use knownTypeFields to get native Kubernetes Deployment specs to show as in sync? What exactly is the field that goes in argocd-cm? I tried a few variations as well.
@ashinsabu3 I simply went with the quick workaround and changed the spec to match the expected unit.
I haven't tried your suggestion, since the workaround was simple to implement and takes care of the out-of-sync issue, but it would be nice if no workaround were needed here.
Yep, same here. 1.0Gi becomes 1Gi in my values. Unfortunate, but easy enough to work around.
I tried this with 2.10.10 and had some luck: these differences were no longer reported in the diff, but I could not pin down the change that fixed it. (I manually edited just the image for the app-controller, which seemed to fix it; I later upgraded the other components to match this version.)
@clayvan We had the same problem on our Deployments and managed to fix the issue using the knownTypeFields parameter in the argocd-cm ConfigMap:
resource.customizations.knownTypeFields.apps_Deployment: |
  - field: spec.template.spec
    type: core/v1/PodSpec
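For context, here is how that key might sit inside a complete ConfigMap manifest. This is a sketch under assumptions: the `argocd` namespace and the labels reflect a default installation and may differ in your setup, and an existing argocd-cm will likely contain other keys that should be preserved. The key suffix follows the group_Kind pattern, so a Deployment (API group `apps`) uses `apps_Deployment`.

```yaml
# Sketch of the argocd-cm ConfigMap carrying the knownTypeFields workaround.
# Assumption: Argo CD is installed in the "argocd" namespace with default labels.
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-cm
  namespace: argocd
  labels:
    app.kubernetes.io/name: argocd-cm
    app.kubernetes.io/part-of: argocd
data:
  # Treat spec.template.spec of apps/Deployment as a core/v1 PodSpec so that
  # equivalent quantities (for example 0.5 and 500m) are compared as equal.
  resource.customizations.knownTypeFields.apps_Deployment: |
    - field: spec.template.spec
      type: core/v1/PodSpec
```

Merging the `data` entry into the existing ConfigMap, rather than replacing the whole object, avoids dropping unrelated settings.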
Workaround documented in https://argo-cd.readthedocs.io/en/stable/user-guide/diffing/#known-kubernetes-types-in-crds-resource-limits-volume-mounts-etc
Let's track a potential implementation in https://github.com/argoproj/argo-cd/issues/16400.