kargo icon indicating copy to clipboard operation
kargo copied to clipboard

documentation: need to document analysis templates better -- mainly with relevant links to rollouts docs

Open WZBFLM opened this issue 1 year ago • 8 comments

Checklist

  • [x] I've searched the issue queue to verify this is not a duplicate bug report.
  • [x] I've included steps to reproduce the bug.
  • [x] I've pasted the output of kargo version.
  • [x] I've pasted logs, if applicable.

Description

In Kargo 0.7.1 the handover of stage verification.args[] to the actual analysis run stopped working. Tested this in 0.3.0 successfully but now when re-evaluating for productive use, this feature seems to stopped working.

Steps to Reproduce

apiVersion: kargo.akuity.io/v1alpha1
kind: Stage
metadata:
  name: dev
  namespace: kargo-demo-helm
spec:
...
...
  verification:
    analysisTemplates:
    - name: kargo-demo-helm
    analysisRunMetadata:
      labels:
        app: kargo-demo-helm
      annotations:
        bat: baz4
    args:
    - name: exit-code
      value: "0"    
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: kargo-demo-helm
  namespace: kargo-demo-helm
spec:
  metrics:
  - name: fail-or-pass
    provider:
      job:
        spec:
          template:
            spec:
              containers:
              - name: sleep
                image: alpine:latest
                command: [sh, -c]
                args:
                - exit {{args.exit-code}}
              restartPolicy: Never
          backoffLimit: 1      

After trigger of promotion:

apiVersion: argoproj.io/v1alpha1
kind: AnalysisRun
metadata:
  annotations:
    bat: baz4
  creationTimestamp: "2024-07-23T11:06:42Z"
  generation: 2
  labels:
    app: kargo-demo-helm
    kargo.akuity.io/freight: 37903f475b44b5d801dbffa3e35c2dac9a363d27
    kargo.akuity.io/promotion: dev.01j3fmb8dh5pwhq2pc00epcxdq.37903f4
    kargo.akuity.io/stage: dev
...
...
status:
  message: 'Unable to resolve metric arguments: failed to resolve {{args.exit-code}}'
  phase: Error
  runSummary: {}

This test passed in 0.3.0 version

Version

v0.7.1

Logs

From rollouts-controller:

time="2024-07-23T11:06:42Z" level=warning msg="Unable to resolve metric arguments: failed to resolve {{args.exit-code}}" analysisrun=dev.01j3fmb9ng4qq33a2c314em3ts.37903f4 namespace=kargo-demo-helm

WZBFLM avatar Jul 23 '24 11:07 WZBFLM

I wonder if this is somehow related to #1577, which was added in 0.5.0

hiddeco avatar Jul 23 '24 17:07 hiddeco

@hiddeco that seems a good hypothesis. I was going to dig into this, and still plan to, but if you want to look closer, please feel free.

krancour avatar Jul 23 '24 23:07 krancour

The solution is to ensure the AnalysisTemplate contains the definition of the argument you are attempting to use:

---
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: kargo-demo-helm
spec:
  args:
  - name: exit-code
  metrics:
  - name: fail-or-pass
    provider:
      job:
        spec:
          template:
            spec:
              containers:
              - name: sleep
                image: alpine:latest
                command: [sh, -c]
                args:
                - exit {{args.exit-code}}
              restartPolicy: Never
          backoffLimit: 1

This subtle change is required because the bug fix in #1577 now uses the arguments from the AnalysisTemplate as a (default) base to resolve #1573. This implicitly means that arguments must be defined within the AnalysisTemplate for values (overrides) from the Stage to be taken into account.

hiddeco avatar Jul 24 '24 10:07 hiddeco

This implicitly means that arguments must be defined within the AnalysisTemplate for values (overrides) from the Stage to be taken into account.

And that's how analyses work natively in Rollouts, is it not?

i.e. This is a case of "you were dependent on a bug and we fixed the bug?"

krancour avatar Jul 24 '24 13:07 krancour

Yes, based on my current understanding that's correct. You can't implicitly add new variables, so to say.

hiddeco avatar Jul 24 '24 14:07 hiddeco

I will re-test today and give feedback. Thanks so much for the quick response.

WZBFLM avatar Jul 24 '24 14:07 WZBFLM

Hi guys, can confirm: By defining the parameters in the AnalysisTemplate it works. Also from overall pattern it make sense. Any chance I can contribute some documentation on this?

WZBFLM avatar Jul 24 '24 15:07 WZBFLM

@WZBFLM that would be great. There is a Verifications section in docs/docs/15-concepts.md.

One thing I don't want to do is go overboard repeating things that are already covered by Rollouts docs (as long as we're linking to the relevant pages), but adding note or info blocks to call out especially important details would be great.

krancour avatar Jul 24 '24 16:07 krancour