fix: respect `workflowDefaults` when validating workflows. Fixes #10946. Fixes #11465
Fixes #10946 and Fixes #11465
Motivation
Submitting a workflow with a variable reference that points to a label or annotation that comes from workflowDefaults will always fail with an error like the following:
msg="Failed to submit workflow: rpc error: code = InvalidArgument desc = templates.whalesay-template: failed to resolve {{workflow.labels.arg-name}}
Modifications
This error is happening here: https://github.com/argoproj/argo-workflows/blob/2dac1266b446d059ac6df78afe3d09a6ddb3af4b/workflow/validate/validate.go#L705
The scope argument comes from ValidateWorkflow(), which isn't aware of workflowDefaults, and therefore doesn't include any labels/annotations defined therein. This is purely a validation issue, since workflowDefaults is merged into the workflow at runtime by
https://github.com/argoproj/argo-workflows/blob/2dac1266b446d059ac6df78afe3d09a6ddb3af4b/workflow/controller/controller.go#L1375
This updates ValidateWorkflow() to take in the workflowDefaults (if any) and use that to populate the context with any labels/annotations, which is ultimately used to populate the scope argument. Most of the changes in this PR is just passing around the workflowDefaults so that it can ultimately get passed to ValidateWorkflow().
Verification
I opted to modify the existing TestSubmitWorkflowTemplateWorkflowMetadataSubstitution E2E test to verify this is working. I could've added a new test, but considering how slow the CI pipeline already is, that seems suboptimal.
$ make TestSubmitWorkflowTemplateWorkflowMetadataSubstitution
<SNIP>
Checking expectation workflow-template-submittable-t8f5r
workflow-template-submittable-t8f5r : Succeeded
Found node id=workflow-template-submittable-t8f5r type=Pod
=== PASS: WorkflowTemplateSuite/TestSubmitWorkflowTemplateWorkflowMetadataSubstitution
--- PASS: TestWorkflowTemplateSuite (8.24s)
--- PASS: TestWorkflowTemplateSuite/TestSubmitWorkflowTemplateWorkflowMetadataSubstitution (8.24s)
=== RUN TestWorkflowSuite
--- PASS: TestWorkflowSuite (0.00s)
PASS
ok github.com/argoproj/argo-workflows/v3/test/e2e 38.532s
Test failures seem due to flakiness. CI / E2E Tests (test-executor, minimal, false) (pull_request) is failing with the same error on the main branch: https://github.com/argoproj/argo-workflows/actions/runs/10969440498/job/30462136331
For the failure in CI / E2E Tests (test-api, mysql, true) (pull_request) , I verified make TestCronCountersReplace passes locally:
=== PASS: MetricsSuite/TestCronCountersReplace
=== SLOW TEST: MetricsSuite/TestCronCountersReplace took 2m1s
--- PASS: TestMetricsSuite (121.10s)
--- PASS: TestMetricsSuite/TestCronCountersReplace (121.09s)
=== RUN TestPodCleanupSuite
--- PASS: TestPodCleanupSuite (0.01s)
=== RUN TestProgressSuite
--- PASS: TestProgressSuite (0.01s)
=== RUN TestResourceTemplateSuite
--- PASS: TestResourceTemplateSuite (0.01s)
=== RUN TestRetrySuite
--- PASS: TestRetrySuite (0.01s)
=== RUN TestRunAsNonRootSuite
--- PASS: TestRunAsNonRootSuite (0.01s)
=== RUN TestSemaphoreSuite
--- PASS: TestSemaphoreSuite (0.01s)
=== RUN TestSignalsSuite
--- PASS: TestSignalsSuite (0.01s)
=== RUN TestConfigMapKeySelectorSubstitutionSuite
--- PASS: TestConfigMapKeySelectorSubstitutionSuite (0.00s)
=== RUN TestWorkflowInputsOverridableSuiteSuite
--- PASS: TestWorkflowInputsOverridableSuiteSuite (0.00s)
=== RUN TestWorkflowTemplateSuite
--- PASS: TestWorkflowTemplateSuite (0.00s)
=== RUN TestWorkflowSuite
--- PASS: TestWorkflowSuite (0.00s)
PASS
ok github.com/argoproj/argo-workflows/v3/test/e2e 124.559s
Test failures are unrelated and should go away when https://github.com/argoproj/argo-workflows/pull/13660 is merged