argo-cd icon indicating copy to clipboard operation
argo-cd copied to clipboard

Error constructing app status patch: unable to find api field in struct RawExtension for the json field \"autoscaling\"

Open mikebryant opened this issue 10 months ago • 55 comments

Checklist:

  • [x] I've searched in the docs and FAQ for my answer: https://bit.ly/argocd-faq.
  • [ ] I've included steps to reproduce the bug.
  • [ ] I've pasted the output of argocd version.

Describe the bug

We've switched some applications to the new valuesObject syntax, and those apps have gotten "stuck". They go into a refreshing status and never finish

To Reproduce

Not clear to us how to reproduce. We deleted the entire .status of an affected application and it seemed to start working again. Not left it long enough yet to determine if it'll break again

Expected behavior

Screenshots

Version

argocd-server: v2.8.0+804d4b8
  BuildDate: 2023-08-07T14:25:33Z
  GitCommit: 804d4b8ca6bc4c2cf02c5c971aa923ec5b8623f0
  GitTreeState: clean
  GoVersion: go1.20.6
  Compiler: gc
  Platform: linux/amd64
  Kustomize Version: v5.1.0 2023-06-19T16:58:18Z
  Helm Version: v3.12.1+gf32a527
  Kubectl Version: v0.24.2
  Jsonnet Version: v0.20.0

Logs

{"application":"argocd/istio-gateway-iap","level":"error","msg":"Error constructing app status patch: unable to find api field in struct RawExtension for the json field \"service\"","time":"2023-08-21T09:01:10Z"}
{"application":"argocd/istio-gateway-public","level":"error","msg":"Error constructing app status patch: unable to find api field in struct RawExtension for the json field \"service\"","time":"2023-08-21T09:01:10Z"}
{"application":"argocd/istio-gateway-vpn","level":"error","msg":"Error constructing app status patch: unable to find api field in struct RawExtension for the json field \"service\"","time":"2023-08-21T09:01:25Z"}
{"application":"argocd/istio-gateway-public","level":"error","msg":"Error constructing app status patch: unable to find api field in struct RawExtension for the json field \"service\"","time":"2023-08-21T09:01:25Z"}
{"application":"argocd/istio-gateway-vpn","level":"error","msg":"Error constructing app status patch: unable to find api field in struct RawExtension for the json field \"service\"","time":"2023-08-21T09:01:39Z"}
{"application":"argocd/istio-gateway-public","level":"error","msg":"Error constructing app status patch: unable to find api field in struct RawExtension for the json field \"service\"","time":"2023-08-21T09:01:40Z"}

mikebryant avatar Aug 21 '23 09:08 mikebryant

Application yaml for one exhibiting the problem: app.txt

mikebryant avatar Aug 21 '23 09:08 mikebryant

Found these issues that look relevant:

  • https://github.com/kubernetes/kubernetes/issues/58017
  • https://github.com/helm/helm/pull/3459

persistAppStatus uses CreateTwoWayMergePatch uses strategicpatch. Seems like this might not be happy with the unstructured content in valuesObject?

mikebryant avatar Aug 21 '23 09:08 mikebryant

I've bumped into this today as well, on 2 applications installing Istio and Mimir. I don't yet see any pattern, it seems to be happening randomly on various fields. I've checked the Application resources, and the status.helm.valuesObject looks good. I've deleted it, and the controller populates it, but still it gives the patching error.

This is Argo 2.8.0.

{
    "Version": "v2.8.0+804d4b8",
    "BuildDate": "2023-08-07T14:25:33Z",
    "GitCommit": "804d4b8ca6bc4c2cf02c5c971aa923ec5b8623f0",
    "GitTreeState": "clean",
    "GoVersion": "go1.20.6",
    "Compiler": "gc",
    "Platform": "linux/amd64",
    "KustomizeVersion": "v5.1.0 2023-06-19T16:58:18Z",
    "HelmVersion": "v3.12.1+gf32a527",
    "KubectlVersion": "v0.24.2",
    "JsonnetVersion": "v0.20.0"
}

vladfr avatar Aug 22 '23 09:08 vladfr

I managed to reproduce this consistently. This only happens if the status.sync.comparedTo contains an unquoted boolean. This is why the initial install works. But if you have an unquoted bool field, or you edit the Application and add an unquoted bool, then the Sync breaks.

Steps:

  1. Create an app with Helm source and valuesObject.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: grafana
  namespace: argocd
  finalizers:
    - resources-finalizer.argocd.argoproj.io
spec:
  project: default
  source:
    repoURL: https://grafana.github.io/helm-charts
    targetRevision: 6.58.8
    chart: grafana

    helm:
      valuesObject:
        persistence:
          enabled: true

  destination:
    server: https://kubernetes.default.svc
    namespace: default

  syncPolicy:
    automated:
      prune: true
      selfHeal: true

⚠️ It is important not to quote the boolean value.

  1. After the app is installed and green, edit the file and disable persistence in Grafana:
    helm:
      valuesObject:
        persistence:
          enabled: false

⚠️ Again, it is important not to quote the boolean value.

  1. Apply the file

Expected: App syncs and refreshes correctly.

Actual Result: The app refresh breaks. Go to the UI and hit Refresh. The Refresh button will be greyed out. The error is visible in the controller logs.

I've managed to also reproduce this by running Argo locally.

Strategicmerge shouldn't be used for Unstructured data

This is an older issue also found in Helm, and fixed here. The fix is to use the json merge for Unstructured data, because it seems that the strategicmerge functionality isn't meant to be used on CRDs or other unstructured inputs. I'll work on a PR to address it this week.

Meanwhile, the workaround could be to quote all values, including numbers.

vladfr avatar Aug 23 '23 13:08 vladfr

Brilliantly diagnosed. Looking forward to the PR, thanks @vladfr!

crenshaw-dev avatar Aug 23 '23 14:08 crenshaw-dev

@crenshaw-dev added a PR, tests are passing, feedback is welcomed!

vladfr avatar Aug 25 '23 08:08 vladfr

is it going to be available in 2.8.5 ? Would def help a lot to work with valuesObject!

Jojoooo1 avatar Sep 15 '23 12:09 Jojoooo1

We encounter the problem too. This prevents us from using the valuesObject, which would simplify our deployment-pipelines a lot

joggeli34 avatar Sep 18 '23 07:09 joggeli34

@vladfr We also experience this problem. It would be great to be able to use valueObjects. Any idea when this will be merged?

HerrDerb avatar Sep 18 '23 08:09 HerrDerb

tbh I'm fairly concerned about changing the patch calculation mechanism in a patch release since it changes how every single reconciliation works.

I wonder if there's an interim solution to avoid the problem for just this field, but leave the rest of the patch calculation untouched. Then we can switch over to the new mechanism entirely in 2.9.

crenshaw-dev avatar Sep 18 '23 17:09 crenshaw-dev

Yes, that's my thinking as well. I'm working on the change for just this field, I should be ready soon.

On Mon, 18 Sept 2023, 20:01 Michael Crenshaw, @.***> wrote:

tbh I'm fairly concerned about changing the patch calculation mechanism in a patch release since it changes how every single reconciliation works.

I wonder if there's an interim solution to avoid the problem for just this field, but leave the rest of the patch calculation untouched. Then we can switch over to the new mechanism entirely in 2.9.

— Reply to this email directly, view it on GitHub https://github.com/argoproj/argo-cd/issues/15126#issuecomment-1723997541, or unsubscribe https://github.com/notifications/unsubscribe-auth/AABT5B62FWOUEUJGHDUCMPDX3B45ZANCNFSM6AAAAAA3YB66W4 . You are receiving this because you were mentioned.Message ID: @.***>

vladfr avatar Sep 18 '23 17:09 vladfr

Is there any progress?

mgledi avatar Oct 25 '23 15:10 mgledi

Any target release planned ?

Jojoooo1 avatar Nov 06 '23 10:11 Jojoooo1

currently running on 2.8.5 and seeing this. Also seems to lock up refreshing and messes with application health status. I see that https://github.com/argoproj/argo-cd/pull/15227 hasn't been merged yet so I imagine this is still an issue in 2.9.X? Are there any plans on getting that merged?

Jrc356 avatar Dec 01 '23 04:12 Jrc356

I'm running app version v2.9.3+6eba5be, still the same issue. Nearly every app is stuck with "Refresh".

sitilge avatar Dec 04 '23 11:12 sitilge

I suppose there will be patch versions for 2.8 and 2.9 ?

Zeratoxx avatar Dec 06 '23 11:12 Zeratoxx

Using application sets to generate applications, but even quoting in the applicationset template all the valuesObject values is not fix for us, as all the quotes disappear in generated Application manifest. So seems we have to stay with values: only multiline string.

hau21um avatar Dec 11 '23 16:12 hau21um

I believe this PR needs to be modified to only calculate json patches (instead of strategic merge patches) for changes to the valuesObject field: https://github.com/argoproj/argo-cd/pull/15227

Right now the PR changes the patch calculation logic for every single field, which seems likely excessive.

crenshaw-dev avatar Dec 11 '23 20:12 crenshaw-dev

I'd be happy to contribute, but it's not immediately obvious to me which changes are required to "only" calculate JSON patches for a single field, while still use strategic merges for the rest. I have no prior experience to JSON patching/merging, so going from very little.

In an attempt to understand more about the error itself, I went down a bit of a rabbit hole to see where exactly the error gets raised, and what the execution path seems to be.

As follows (from bottom of stack, back to top):

  • The root error (unable to find api field in struct %s for the json field %q) is raised in LookupPatchMetadataForStruct, in the forked third party json package exported by k8s.io/apimachinery.
  • This is used in handleMapDiff function of the strategicpatch package, also exported by k8s.io/apimachinery.
  • handleMapDiff is called by various methods exposed by the strategicpatch package, but trickles up to CreateTwoWayMergePatch, which is called from the diff package found in github.com/argoproj/gitops-engine
  • diff.CreateTwoWayMergePatch is then the function called by the normalizeApplication method in appcontroller.go (which is where the changes from #15227 happen)

Has anyone been able to spin up a simple go test which reproduces this error? And similarly, a go test which differentiates between a JSON VS a two way merge patch, such that future changes can track that they aren't accidentally changing previously required behaviour? Both of these seem to be the "requirements" for getting a PR merged to fix this issue

PaulSonOfLars avatar Dec 12 '23 18:12 PaulSonOfLars

@PaulSonOfLars I've made the changes to use json for just the valuesObject. Like yourself, I too couldn't see how to treat a single field separately, so I'm doing two patches, and merging. The end result is... well, if you have time, please take a look on #15227.

Meanwhile, for folks that want to use this now, we worked around the issue by using https://github.com/upsidr/importer to keep Helm values in a separate file.

vladfr avatar Dec 13 '23 15:12 vladfr

it's not immediately obvious to me which changes are required to "only" calculate JSON patches for a single field

@PaulSonOfLars that's a very fair concern. Having seem the updated patch, I worry less about merging it because the impact surface area is smaller. But I still want to understand the issue more thoroughly, and I don't have time to do that just yet. If folks have time to investigate more, any additional findings would be appreciated.

crenshaw-dev avatar Dec 13 '23 16:12 crenshaw-dev

@PaulSonOfLars I've made the changes to use json for just the valuesObject. Like yourself, I too couldn't see how to treat a single field separately, so I'm doing two patches, and merging. The end result is... well, if you have time, please take a look on #15227.

I had considered the two-patch approach, but I was unsure how merging two different objects at different JSON paths would work (ie, one from the Application level and one from ...Helm.ValuesObject, if that makes sense?). It seems to work fine from your branch, so maybe it's good enough to do the app definitions like you do.

I've got some tests working off of your PR (and also trying my hand at a refactor, if you don't mind!), will let you know what I find!

PaulSonOfLars avatar Dec 13 '23 18:12 PaulSonOfLars

Yep, got some tests ready - will push later tonight,

Also adding support for Sources.Helm on top of Source.Helm; figure thats going to be the next issue report otherwise!

PaulSonOfLars avatar Dec 13 '23 18:12 PaulSonOfLars

Hi. any updates regarding the fix? it blocking us

alexvaigm avatar Dec 28 '23 13:12 alexvaigm

Hi @alexvaigm - Happy new year!

Latest update from my side is simply the PR I opened above (#16602), which should fix the issue + add tests. Friendly ping to @crenshaw-dev in case it fell off the radar after the festive season :)

PaulSonOfLars avatar Jan 02 '24 10:01 PaulSonOfLars

btw, if this isn't obvious to folks — a basic workaround here is to just not use valuesObject in the Application object, perhaps by using values (whose value is a string) instead. This might not work for all use cases, but eg, in our use case we were generating our Application object in an app-of-apps in a Helm chart, so switching the values object from being an object to a string in our chart was straightforward and fully avoids this issue.

glasser avatar Jan 03 '24 18:01 glasser

btw, if this isn't obvious to folks — a basic workaround here is to just not use valuesObject in the Application object, perhaps by using values (whose value is a string) instead. This might not work for all use cases, but eg, in our use case we were generating our Application object in an app-of-apps in a Helm chart, so switching the values object from being an object to a string in our chart was straightforward and fully avoids this issue.

what do you mean switch object to string? we are creating apps by argocd cli :

argocd app create <app name> .... --values-literal-file values.yaml

alexvaigm avatar Jan 03 '24 18:01 alexvaigm

I don't use the ArgoCD CLI: this is about writing an Application object in a YAML file (in my case, in a Helm chart template since I'm using an app-of-apps to create the apps that need values).

glasser avatar Jan 03 '24 18:01 glasser

(A glance at the ArgoCD CLI source code does suggest that it currently does always use valuesObject rather than values when used with --values-literal-file, yeah.)

glasser avatar Jan 03 '24 18:01 glasser

The generated app object looks like below. how it looks for you?

spec:
  destination:
    namespace:  test
    server: https://kubernetes.default.svc
  project: default
  source:
    chart:  test-chart
    helm:
      valuesObject:
        openshift: false
        apigw: true
.....

alexvaigm avatar Jan 03 '24 19:01 alexvaigm