ArgoCD OutOfSync: sync loop due to 0.5CPU vs 500mCPU

Open 4TT1L4 opened this issue 10 months ago • 9 comments

Describe the bug

ArgoCD detects a configuration difference and keeps attempting to sync when the CPU limit and request are defined in mCPU units.

ArgoCD does not recognize that 0.5 CPU is the same as 500 mCPU.

To Reproduce

  1. Deploy a pod with a 500 mCPU resource limit or request value
  2. Sync and observe the sync loop
  3. Go to the diff tab of the deployment and observe the difference.
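
A minimal manifest along the lines of these steps might look like the sketch below. The resource name, labels, and image are hypothetical placeholders, and which spelling was committed to Git in the original report is an assumption: the example writes the CPU value as `0.5`, which the Kubernetes API server stores and returns in its canonical form, `500m`.

```yaml
# Hypothetical reproduction manifest; name, labels, and image are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cpu-units-demo
spec:
  replicas: 1
  selector:
    matchLabels:
      app: cpu-units-demo
  template:
    metadata:
      labels:
        app: cpu-units-demo
    spec:
      containers:
        - name: app
          image: nginx:1.25
          resources:
            requests:
              cpu: "0.5"   # the live object reports this as 500m
            limits:
              cpu: "0.5"   # same quantity, different spelling in the live object
```

Writing the value as `500m` in Git, so that it matches the canonical form, is the "change the spec to match the expected unit" workaround mentioned later in the thread.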

Expected behavior

ArgoCD should recognize that 500 mCPU and 0.5 CPU are the same value.

Screenshots image

Additional context

  • Observed with v2.10.4+f5d63a5
  • ArgoCD gets stuck in a sync loop, since it thinks that the deployed manifest is not in sync and keeps on re-applying the manifest. [NOK]

4TT1L4 avatar Apr 01 '24 08:04 4TT1L4

This is mentioned in the docs, here, with a workaround: https://argo-cd.readthedocs.io/en/stable/user-guide/diffing/#known-kubernetes-types-in-crds-resource-limits-volume-mounts-etc

jgwest avatar Apr 02 '24 22:04 jgwest

Thank you very much for mentioning the workaround. I had worked out on my own, without the docs, that this could help, and I implemented the changes right away. Still, I am fairly confident that this behavior is not intended, and it would be nice to have a proper fix that prevents the endless sync loop in such cases.

I can confirm that the workaround described in the docs works well. The loop is gone and I have not seen any issues since making the change, but I hope proper support can be added for CPU requests and limits expressed in mCPU units.

4TT1L4 avatar Apr 03 '24 06:04 4TT1L4

So are you deploying a Pod on its own, and not through one of the abstractions (Deployment, StatefulSet, Job etc)?

jannfis avatar Apr 05 '24 16:04 jannfis

The Pod was deployed as part of a Kubernetes Deployment.

4TT1L4 avatar Apr 05 '24 16:04 4TT1L4

So the OutOfSync state is actually on your Deployment resource, not on the Pod, right?

jannfis avatar Apr 05 '24 19:04 jannfis

Duplicate and/or similar: https://github.com/argoproj/argo-cd/issues/16400

crenshaw-dev avatar Apr 05 '24 20:04 crenshaw-dev

So the OutOfSync state is actually on your Deployment resource, not on the Pod, right?

Yes. The OutOfSync state was on my Deployment resource

4TT1L4 avatar Apr 20 '24 08:04 4TT1L4

I think I am experiencing the same issue after adding a patches section to my Kustomization file. The CPU request is not added by the patch, and if I comment out the patches section, my Deployment is no longer OutOfSync.

ArgoCD version: v2.9.10

Wickasanew avatar Apr 23 '24 13:04 Wickasanew

@4TT1L4 Can you share what your knownTypeFields config looks like? The example in the docs is for an Argo Rollout, but I can't get the Deployment spec to stop showing as OutOfSync.

        knownTypeFields:
          # example in docs
          argoproj.io_Rollout: |
            - field: spec.template.spec
              type: core/v1/PodSpec

          # doesn't work
          k8s.io_Deployment: |
            - field: spec.template.spec
              type: core/v1/PodSpec

clayvan avatar May 02 '24 20:05 clayvan

@clayvan @4TT1L4 Were you able to use knownTypeFields to get native Kubernetes Deployment specs to show as in sync? What exactly is the field that goes in argocd-cm? I have tried a few variations as well.

ashinsabu3 avatar Jun 26 '24 12:06 ashinsabu3

@ashinsabu3 I simply went with the quick workaround and changed the spec to match the expected unit.

I haven't tried your suggestion, since the workaround was simple to implement and takes care of the out-of-sync issue, but it would be nice if no workaround were needed here.

4TT1L4 avatar Jun 26 '24 15:06 4TT1L4

Yep, same here. 1.0Gi becomes 1Gi in my values. Unfortunate, but easy enough to work around.

clayvan avatar Jun 26 '24 15:06 clayvan

I tried this with 2.10.10 and had some luck: these values were no longer reported in the diff, but I could not pin down the change that made the difference. (I manually edited just the image for the app-controller and that seemed to fix it; I later upgraded the other components to match this version.)

ashinsabu3 avatar Jul 26 '24 06:07 ashinsabu3

@clayvan We had the same problem on our Deployments and managed to fix it using the knownTypeFields parameter in the argocd-cm ConfigMap:

resource.customizations.knownTypeFields.apps_Deployment: |
   - field: spec.template.spec
     type: core/v1/PodSpec
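
For context, the key above could sit in the full ConfigMap roughly as sketched below. The `argocd` namespace and the labels are assumptions based on a default installation, not details given in this thread. The key suffix appears to follow a `<group>_<kind>` pattern, which would explain why `apps_Deployment` matches (Deployment belongs to the `apps` API group) while the earlier `k8s.io_Deployment` attempt did not.

```yaml
# Sketch of argocd-cm with the knownTypeFields entry from this comment.
# Assumption: Argo CD is installed in the default "argocd" namespace.
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-cm
  namespace: argocd
  labels:
    app.kubernetes.io/name: argocd-cm
    app.kubernetes.io/part-of: argocd
data:
  # Treat spec.template.spec of apps/Deployment as a core/v1 PodSpec when
  # diffing, so that equivalent quantities such as 0.5 and 500m compare equal.
  resource.customizations.knownTypeFields.apps_Deployment: |
    - field: spec.template.spec
      type: core/v1/PodSpec
```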

slayne avatar Jul 31 '24 09:07 slayne

Workaround documented in https://argo-cd.readthedocs.io/en/stable/user-guide/diffing/#known-kubernetes-types-in-crds-resource-limits-volume-mounts-etc

Let's track a potential implementation in https://github.com/argoproj/argo-cd/issues/16400.

agaudreault avatar Sep 03 '24 12:09 agaudreault