Duplicate env variable ARGO_TEMPLATE results in exceeding the etcd 1MB size limit
Summary
I am running a containerSet with lots of steps, and each step has a large parameter input. Because every container gets an env variable called ARGO_TEMPLATE, the overall pod description is too large to fit in etcd, exceeding the 1MB limit: (10 + init + wait) = 12 containers, and 12 * 100k = 1200k > 1MB. What is the usage of ARGO_TEMPLATE? Can we delete it?
I am using v3.1.10.
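The arithmetic above can be sketched as a quick estimate. This is only an illustration: the 100k template size and the container count come from the summary, while the function name and the reading of "1MB" as 1 MiB are assumptions.

```python
# Back-of-the-envelope estimate of how much of the pod spec is taken up
# by ARGO_TEMPLATE when every container carries its own copy.
ETCD_LIMIT_BYTES = 1024 * 1024  # assumption: the "1MB" limit read as 1 MiB


def argo_template_bytes(step_containers: int, template_bytes: int,
                        extra_containers: int = 2) -> int:
    """Bytes spent on ARGO_TEMPLATE copies; extra_containers = init + wait."""
    return (step_containers + extra_containers) * template_bytes


# 10 step containers, each with a ~100k template, plus init and wait.
total = argo_template_bytes(step_containers=10, template_bytes=100 * 1000)
print(total, total > ETCD_LIMIT_BYTES)
```

The env variable copies alone already exceed the assumed limit, before counting the rest of the pod spec.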
Message from the maintainers:
Love this enhancement proposal? Give it a 👍. We prioritise the proposals with the most 👍.
> What is the usage of ARGO_TEMPLATE? Can we delete it?
No.
However, we could replace it with just the information needed by the emissary. The emissary only needs to know which containers it must wait on. It does not need the whole template.
That would be nice. Looking forward to the coming improvement!
> The emissary only needs to know which containers it must wait on.
@alexec Looks like tests are breaking after I removed the parameters. https://github.com/argoproj/argo-workflows/pull/9698 Any hints on what I may have missed?
At Intuit we are actually relying on the $ARGO_TEMPLATE environment variable inputs because we had an issue in which we were originally trying to do this in our Step:
image: "docker.intuit.com/dev/patterns/kubernetes/dev/kubectl-awscli:v1.22.15"
command:
  - sh
  - -c
args:
  - |
    echo "{{`{{inputs.parameters.manifest}}`}}" | ....
but we found that we exceeded the maximum argument size limit allowed by Linux. So, we changed the Step to use the ARGO_TEMPLATE environment variable:
image: "docker.intuit.com/dev/patterns/kubernetes/dev/kubectl-awscli:v1.22.15"
command:
  - sh
  - -c
args: ["
  getconf ARG_MAX;
  echo $ARGO_TEMPLATE | yq '.inputs.parameters[0].value' --unwrapScalar > /tmp/manifest-64;
  ....
  "]
(We had also tried an alternative solution of mounting the input parameter as a RawArtifact, but that ran into an issue and didn't work either.)
Now, perhaps there could be an alternative solution in which the value of the environment variable is compressed or something...
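For readers who do not have yq in their image, the same lookup as the yq pipeline above can be sketched in Python. This assumes that ARGO_TEMPLATE holds a JSON encoding of the template (as described elsewhere in this thread); the function name and the sample data are made up for illustration.

```python
import json
import os


def first_input_parameter(env=os.environ) -> str:
    """Return inputs.parameters[0].value from the JSON in ARGO_TEMPLATE."""
    template = json.loads(env["ARGO_TEMPLATE"])
    return template["inputs"]["parameters"][0]["value"]


# Illustrative stand-in for the real environment; the values are invented.
fake_env = {
    "ARGO_TEMPLATE": json.dumps(
        {"inputs": {"parameters": [{"name": "manifest", "value": "a2luZDogUG9k"}]}}
    )
}
print(first_input_parameter(fake_env))
```

Because the value is read from the environment rather than passed as a command-line argument, it does not pass through the argument list the way the original `echo "{{inputs.parameters.manifest}}"` did.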
> but we found that we exceeded the maximum argument size limit allowed by Linux. So, we changed the Step to use the ARGO_TEMPLATE environment variable:
Did this work for you? I'm having the exact same issue.
https://github.com/argoproj/argo-workflows/blob/e7883c4fe849020b3a0503dc6dd2a7a9c911c386/workflow/executor/emissary/emissary.go#L29
I think ARGO_TEMPLATE is needed only for the init container, since it writes a JSON encoding of the template to /var/run/argo/template.
The wait container and the main containers can obtain the template from /var/run/argo/template, as below:
https://github.com/argoproj/argo-workflows/blob/e7883c4fe849020b3a0503dc6dd2a7a9c911c386/cmd/argoexec/commands/emissary.go#L64
cc @tooptoop4
what about https://github.com/argoproj/argo-workflows/blob/3652241a42be8e1cc699719f97ff7af2c7ca86c5/workflow/controller/workflowpod.go#L491-L493 using Inputs? @jswxstw
I don't understand, what do you mean? @tooptoop4
@jswxstw the code I linked seems to take inputs from the template. Are you saying there should be another variable for that purpose, not ARGO_TEMPLATE?
@tooptoop4 No, I don't quite understand your question and how it is related to this issue.
I actually meant that I think the env variable ARGO_TEMPLATE should only be set in the init container, and other containers should read it from the file /var/run/argo/template if needed.
Additionally, inputs.parameters in ARGO_TEMPLATE seems useless; maybe we can just keep inputs.artifacts, but I'm not sure about that.
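The proposal above (read the template from the file, with the env variable only as a fallback) can be sketched roughly as follows. This is a Python illustration of the idea, not the executor's actual Go code; the function name and the fallback order are assumptions.

```python
import json
import os
import tempfile
from pathlib import Path

TEMPLATE_PATH = Path("/var/run/argo/template")  # path mentioned in this thread


def load_template(path: Path = TEMPLATE_PATH, env=os.environ) -> dict:
    """Prefer the file written by the init container; fall back to the env var."""
    if path.is_file():
        return json.loads(path.read_text())
    return json.loads(env["ARGO_TEMPLATE"])


# Demonstration with a temporary directory instead of the real path.
with tempfile.TemporaryDirectory() as tmp:
    template_file = Path(tmp) / "template"
    template_file.write_text(json.dumps({"name": "from-file"}))
    from_file = load_template(template_file, env={})
    from_env = load_template(Path(tmp) / "missing",
                             env={"ARGO_TEMPLATE": '{"name": "from-env"}'})

print(from_file["name"], from_env["name"])
```

With this shape, only the init container would need the env variable; every other container could rely on the shared file.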
#12325 will offload the env variable ARGO_TEMPLATE to a ConfigMap when ARGO_TEMPLATE is larger than 128KB, which can solve this issue to some extent. @tooptoop4
But I think #12325 still has some areas for improvement:
- Removing `inputs.parameters` from `ARGO_TEMPLATE` can significantly reduce its size.
- `ARGO_TEMPLATE` in the ConfigMap will be mounted to `/argo/config` if `offloadEnvVarTemplate` is enabled, and the `init` container will also write `ARGO_TEMPLATE` to `/var/run/argo/template`, which is duplicated.
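To illustrate the first point, here is a small Python sketch of what dropping `inputs.parameters` does to the encoded size, compared against the 128KB threshold mentioned above. The helper names, the sample template, and the reading of 128KB as 128 * 1024 bytes are assumptions; note also that some users in this thread rely on the parameters being present, so this only shows the size effect.

```python
import json

OFFLOAD_THRESHOLD_BYTES = 128 * 1024  # assumption: "128KB" read as 128 KiB


def strip_input_parameters(template: dict) -> dict:
    """Return a copy of the template without inputs.parameters."""
    slim = dict(template)
    if "inputs" in slim:
        slim["inputs"] = {k: v for k, v in slim["inputs"].items()
                          if k != "parameters"}
    return slim


def encoded_size(template: dict) -> int:
    """Size in bytes of the JSON encoding of the template."""
    return len(json.dumps(template).encode("utf-8"))


# Invented sample: one large parameter and one small artifact entry.
template = {
    "name": "deploy",
    "inputs": {
        "parameters": [{"name": "manifest", "value": "x" * 200_000}],
        "artifacts": [{"name": "src", "path": "/src"}],
    },
}
slim = strip_input_parameters(template)

print(encoded_size(template) > OFFLOAD_THRESHOLD_BYTES)
print(encoded_size(slim) > OFFLOAD_THRESHOLD_BYTES)
```

In this sample the full template is above the threshold and the slimmed one is far below it, because almost all of the size sits in the parameter value.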
I don't have any permissions to operate issues. Is there something wrong? @agilgur5
> I don't have any permissions to operate issues. Is there something wrong? @agilgur5
You pop up correctly as a "Member".
So you should have "triage" permissions.
But looking at the Argoproj teams, it looks like you're not part of the "members" team for some reason 👀 I'm guessing that's the team assigned to triage per repo or something.
@terrytangyuan could you add @jswxstw to the "members" team? (I don't have permissions to do so, org owners only.) Might want to double-check other members too, as there is a diff between that team and the "people" list (some of those are seemingly correct diffs, like bots, but at least a few others seem to incorrectly have 0 teams).
EDIT: I also asked in the Approvers Slack channel (private link) if some org admins could take a look at this diff.
Added
> I actually meant that I think env variable `ARGO_TEMPLATE` should only be set in `init` container, and other containers should read it from the file `/var/run/argo/template` if needed.
This part wasn't solved by #13742, so it sounds like this issue should still be open.