argo-workflows

workflow not moving to archive

Open tooptoop4 opened this issue 1 year ago • 4 comments

Pre-requisites

  • [X] I have double-checked my configuration
  • [X] I can confirm the issue exists when I tested with :latest
  • [ ] I'd like to contribute the fix myself (see contributing guide)

What happened/what you expected to happen?

I have:

        ttlStrategy:
          secondsAfterFailure: 240
          secondsAfterSuccess: 180

but my workflow was not removed; it still appears in the Workflows UI.

Version

3.4.0

Paste a small workflow that reproduces the issue. We must be able to run the workflow; don't enter a workflow that uses private images.

n/a

Logs from the workflow controller

kubectl logs -n argo deploy/workflow-controller | grep ${workflow}

time="2022-09-21T05:13:00.032Z" level=info msg="Processing workflow" namespace=auth workflow=rs-grants-1663737180
time="2022-09-21T05:13:00.039Z" level=info msg="Updated phase  -> Running" namespace=auth workflow=rs-grants-1663737180
time="2022-09-21T05:13:00.042Z" level=info msg="Retry node rs-grants-1663737180 initialized Running" namespace=auth workflow=rs-grants-1663737180
time="2022-09-21T05:13:00.042Z" level=info msg="Pod node rs-grants-1663737180-1835383106 initialized Pending" namespace=auth workflow=rs-grants-1663737180
time="2022-09-21T05:13:00.138Z" level=info msg="Created pod: rs-grants-1663737180(0) (rs-grants-1663737180-main-1835383106)" namespace=auth workflow=rs-grants-1663737180
time="2022-09-21T05:13:00.138Z" level=info msg="TaskSet Reconciliation" namespace=auth workflow=rs-grants-1663737180
time="2022-09-21T05:13:00.138Z" level=info msg=reconcileAgentPod namespace=auth workflow=rs-grants-1663737180
time="2022-09-21T05:13:00.172Z" level=info msg="Workflow update successful" namespace=auth phase=Running resourceVersion=97644443 workflow=rs-grants-1663737180
time="2022-09-21T05:13:10.033Z" level=info msg="Processing workflow" namespace=auth workflow=rs-grants-1663737180
time="2022-09-21T05:13:10.034Z" level=info msg="Task-result reconciliation" namespace=auth numObjs=0 workflow=rs-grants-1663737180
time="2022-09-21T05:13:10.034Z" level=info msg="node changed" namespace=auth new.message= new.phase=Running new.progress=0/1 nodeID=rs-grants-1663737180-1835383106 old.message= old.phase=Pending old.progress=0/1 workflow=rs-grants-1663737180
time="2022-09-21T05:13:10.035Z" level=info msg="TaskSet Reconciliation" namespace=auth workflow=rs-grants-1663737180
time="2022-09-21T05:13:10.035Z" level=info msg=reconcileAgentPod namespace=auth workflow=rs-grants-1663737180
time="2022-09-21T05:13:10.065Z" level=info msg="Workflow update successful" namespace=auth phase=Running resourceVersion=97644522 workflow=rs-grants-1663737180
time="2022-09-21T05:13:32.688Z" level=info msg="Processing workflow" namespace=auth workflow=rs-grants-1663737180
time="2022-09-21T05:13:32.690Z" level=info msg="Task-result reconciliation" namespace=auth numObjs=0 workflow=rs-grants-1663737180
time="2022-09-21T05:13:32.690Z" level=info msg="node changed" namespace=auth new.message="Error (exit code 1): failed to create new S3 client: WebIdentityErr: failed to retrieve credentials\ncaused by: RequestError: send request failed\ncaused by: Post \"https://sts.amazonaws.com/\": Service Unavailable" new.phase=Error new.progress=0/1 nodeID=rs-grants-1663737180-1835383106 old.message= old.phase=Running old.progress=0/1 workflow=rs-grants-1663737180
time="2022-09-21T05:13:32.691Z" level=info msg="node has maxDuration set, setting executionDeadline to: Wed Sep 21 05:14:00 +0000 (27 seconds from now)" namespace=auth node=rs-grants-1663737180 workflow=rs-grants-1663737180
time="2022-09-21T05:13:32.691Z" level=info msg="1 child nodes of rs-grants-1663737180 failed. Trying again..." namespace=auth workflow=rs-grants-1663737180
time="2022-09-21T05:13:32.693Z" level=info msg="Pod node rs-grants-1663737180-1231241727 initialized Pending" namespace=auth workflow=rs-grants-1663737180
time="2022-09-21T05:13:32.754Z" level=info msg="Created pod: rs-grants-1663737180(1) (rs-grants-1663737180-main-1231241727)" namespace=auth workflow=rs-grants-1663737180
time="2022-09-21T05:13:32.754Z" level=info msg="TaskSet Reconciliation" namespace=auth workflow=rs-grants-1663737180
time="2022-09-21T05:13:32.754Z" level=info msg=reconcileAgentPod namespace=auth workflow=rs-grants-1663737180
time="2022-09-21T05:13:32.803Z" level=info msg="Workflow update successful" namespace=auth phase=Running resourceVersion=97644668 workflow=rs-grants-1663737180
time="2022-09-21T05:13:42.760Z" level=info msg="Processing workflow" namespace=auth workflow=rs-grants-1663737180
time="2022-09-21T05:13:42.762Z" level=info msg="Task-result reconciliation" namespace=auth numObjs=0 workflow=rs-grants-1663737180
time="2022-09-21T05:13:42.762Z" level=info msg="node unchanged" namespace=auth nodeID=rs-grants-1663737180-1835383106 workflow=rs-grants-1663737180
time="2022-09-21T05:13:42.762Z" level=info msg="node changed" namespace=auth new.message= new.phase=Running new.progress=0/1 nodeID=rs-grants-1663737180-1231241727 old.message= old.phase=Pending old.progress=0/1 workflow=rs-grants-1663737180
time="2022-09-21T05:13:42.763Z" level=info msg="TaskSet Reconciliation" namespace=auth workflow=rs-grants-1663737180
time="2022-09-21T05:13:42.763Z" level=info msg=reconcileAgentPod namespace=auth workflow=rs-grants-1663737180
time="2022-09-21T05:13:42.783Z" level=info msg="Workflow update successful" namespace=auth phase=Running resourceVersion=97644743 workflow=rs-grants-1663737180
time="2022-09-21T05:14:05.472Z" level=info msg="Processing workflow" namespace=auth workflow=rs-grants-1663737180
time="2022-09-21T05:14:05.473Z" level=info msg="Task-result reconciliation" namespace=auth numObjs=0 workflow=rs-grants-1663737180
time="2022-09-21T05:14:05.473Z" level=info msg="node changed" namespace=auth new.message= new.phase=Running new.progress=0/1 nodeID=rs-grants-1663737180-1231241727 old.message= old.phase=Running old.progress=0/1 workflow=rs-grants-1663737180
time="2022-09-21T05:14:05.473Z" level=info msg="node unchanged" namespace=auth nodeID=rs-grants-1663737180-1835383106 workflow=rs-grants-1663737180
time="2022-09-21T05:14:05.474Z" level=info msg="TaskSet Reconciliation" namespace=auth workflow=rs-grants-1663737180
time="2022-09-21T05:14:05.474Z" level=info msg=reconcileAgentPod namespace=auth workflow=rs-grants-1663737180
time="2022-09-21T05:14:05.478Z" level=info msg="cleaning up pod" action=terminateContainers key=auth/rs-grants-1663737180-main-1231241727/terminateContainers
time="2022-09-21T05:14:05.479Z" level=info msg="https://172.20.0.1:443/api/v1/namespaces/auth/pods/rs-grants-1663737180-main-1231241727/exec?command=%2Fvar%2Frun%2Fargo%2Fargoexec&command=kill&command=15&command=1&container=wait&stderr=true&stdout=true&tty=false"
time="2022-09-21T05:14:05.518Z" level=info msg="Workflow update successful" namespace=auth phase=Running resourceVersion=97644884 workflow=rs-grants-1663737180
time="2022-09-21T05:14:05.669Z" level=info msg="signaled container" container=wait error="<nil>" namespace=auth pod=rs-grants-1663737180-main-1231241727 stderr= stdout="killing 1 with terminated\n"
time="2022-09-21T05:14:17.679Z" level=info msg="Processing workflow" namespace=auth workflow=rs-grants-1663737180
time="2022-09-21T05:14:17.681Z" level=info msg="Task-result reconciliation" namespace=auth numObjs=0 workflow=rs-grants-1663737180
time="2022-09-21T05:14:17.681Z" level=info msg="node unchanged" namespace=auth nodeID=rs-grants-1663737180-1835383106 workflow=rs-grants-1663737180
time="2022-09-21T05:14:17.681Z" level=info msg="node changed" namespace=auth new.message="Error (exit code 1): failed to create new S3 client: WebIdentityErr: failed to retrieve credentials\ncaused by: RequestError: send request failed\ncaused by: Post \"https://sts.amazonaws.com/\": Service Unavailable" new.phase=Error new.progress=0/1 nodeID=rs-grants-1663737180-1231241727 old.message= old.phase=Running old.progress=0/1 workflow=rs-grants-1663737180
time="2022-09-21T05:14:17.682Z" level=info msg="Max duration limit exceeded. Failing..." namespace=auth workflow=rs-grants-1663737180
time="2022-09-21T05:14:17.682Z" level=info msg="node rs-grants-1663737180 phase Running -> Error" namespace=auth workflow=rs-grants-1663737180
time="2022-09-21T05:14:17.682Z" level=info msg="node rs-grants-1663737180 message: Max duration limit exceeded" namespace=auth workflow=rs-grants-1663737180
time="2022-09-21T05:14:17.682Z" level=info msg="node rs-grants-1663737180 finished: 2022-09-21 05:14:17.682271077 +0000 UTC" namespace=auth workflow=rs-grants-1663737180
time="2022-09-21T05:14:17.682Z" level=info msg="TaskSet Reconciliation" namespace=auth workflow=rs-grants-1663737180
time="2022-09-21T05:14:17.682Z" level=info msg=reconcileAgentPod namespace=auth workflow=rs-grants-1663737180
time="2022-09-21T05:14:17.682Z" level=info msg="Updated phase Running -> Error" namespace=auth workflow=rs-grants-1663737180
time="2022-09-21T05:14:17.682Z" level=info msg="Updated message  -> Max duration limit exceeded" namespace=auth workflow=rs-grants-1663737180
time="2022-09-21T05:14:17.682Z" level=info msg="Marking workflow completed" namespace=auth workflow=rs-grants-1663737180
time="2022-09-21T05:14:17.682Z" level=info msg="Marking workflow as pending archiving" namespace=auth workflow=rs-grants-1663737180
time="2022-09-21T05:14:17.682Z" level=info msg="Checking daemoned children of " namespace=auth workflow=rs-grants-1663737180
time="2022-09-21T05:14:17.688Z" level=info msg="cleaning up pod" action=deletePod key=auth/rs-grants-1663737180-1340600742-agent/deletePod
time="2022-09-21T05:14:17.712Z" level=info msg="Workflow update successful" namespace=auth phase=Error resourceVersion=97644960 workflow=rs-grants-1663737180
time="2022-09-21T05:14:17.731Z" level=info msg="archiving workflow" namespace=auth uid=075a2d59-a01a-4740-96d2-0d8bb22f94b7 workflow=rs-grants-1663737180
time="2022-09-21T05:14:22.732Z" level=info msg="cleaning up pod" action=deletePod key=auth/rs-grants-1663737180-main-1835383106/deletePod
time="2022-09-21T05:14:22.732Z" level=info msg="cleaning up pod" action=deletePod key=auth/rs-grants-1663737180-main-1231241727/deletePod
time="2022-09-21T05:14:35.669Z" level=info msg="cleaning up pod" action=killContainers key=auth/rs-grants-1663737180-main-1231241727/killContainers
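
The log above ends with "archiving workflow" for this workflow and shows no later line about the archive or the TTL. To pull just those events out of a longer controller log, a minimal Python sketch could look like the following. It assumes the lines are in the key=value format shown above; `PAIR`, `parse_line` and `archive_timeline` are hypothetical helper names, not part of Argo.

```python
import re

# Matches key=value pairs where the value is either a double-quoted string
# (backslash escapes allowed) or a bare token; an empty value is allowed.
PAIR = re.compile(r'([\w.]+)=("(?:[^"\\]|\\.)*"|\S*)')


def parse_line(line):
    """Parse one controller log line into a dict of its fields."""
    fields = {}
    for key, raw in PAIR.findall(line):
        if len(raw) >= 2 and raw.startswith('"') and raw.endswith('"'):
            # Strip the surrounding quotes and unescape embedded quotes.
            raw = raw[1:-1].replace('\\"', '"')
        fields[key] = raw
    return fields


def archive_timeline(lines, workflow):
    """Return (time, msg) for phase updates and archive-related messages."""
    events = []
    for line in lines:
        fields = parse_line(line)
        if fields.get("workflow") != workflow:
            continue
        msg = fields.get("msg", "")
        if msg.startswith("Updated phase") or "archiv" in msg.lower():
            events.append((fields["time"], msg))
    return events


# Four lines copied from the log above.
SAMPLE = [
    'time="2022-09-21T05:14:17.682Z" level=info msg="Updated phase Running -> Error" namespace=auth workflow=rs-grants-1663737180',
    'time="2022-09-21T05:14:17.682Z" level=info msg="Marking workflow as pending archiving" namespace=auth workflow=rs-grants-1663737180',
    'time="2022-09-21T05:14:17.712Z" level=info msg="Workflow update successful" namespace=auth phase=Error resourceVersion=97644960 workflow=rs-grants-1663737180',
    'time="2022-09-21T05:14:17.731Z" level=info msg="archiving workflow" namespace=auth uid=075a2d59-a01a-4740-96d2-0d8bb22f94b7 workflow=rs-grants-1663737180',
]

events = archive_timeline(SAMPLE, "rs-grants-1663737180")
for when, msg in events:
    print(when, msg)
```

Run against the full log, this should make it easier to see whether anything follows the "archiving workflow" line for a given workflow.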

Logs from your workflow's wait container

kubectl logs -c wait -l workflows.argoproj.io/workflow=${workflow},workflow.argoproj.io/phase!=Succeeded

tooptoop4 avatar Sep 21 '22 07:09 tooptoop4

@tooptoop4 Does this work on the previous version?

Is your workflow archive working fine? Can you check the workflow-controller log or the Archive section in the UI? If Workflow Archive is configured for the workflow, the controller will honor the TTL once the workflow has been archived.
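
For reference, here is a rough sketch of when the TTL would be expected to expire for the run in the report. It assumes the TTL is counted from the finish time printed in the controller log (truncated to microseconds) and that `secondsAfterFailure: 240` is the value that applies to this run. The log shows phase Error, and whether that phase is treated as a failure for TTL purposes is an assumption here. `ttl_deadline` is a hypothetical helper, not an Argo function.

```python
from datetime import datetime, timedelta, timezone


def ttl_deadline(finished_at, ttl_seconds):
    """Earliest time a TTL of ttl_seconds after finished_at could expire."""
    return finished_at + timedelta(seconds=ttl_seconds)


# "node rs-grants-1663737180 finished: 2022-09-21 05:14:17.682271077 +0000 UTC"
finished = datetime(2022, 9, 21, 5, 14, 17, 682271, tzinfo=timezone.utc)

# secondsAfterFailure from the ttlStrategy in the report (assumed to apply).
deadline = ttl_deadline(finished, 240)
print(deadline.isoformat())
```

If the workflow is still listed well after that time, checking whether it ever reached the archive is a reasonable next step.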

sarabala1979 avatar Sep 23 '22 23:09 sarabala1979

@sarabala1979 My archive is working fine for other workflows. I think this intermittent WebIdentityErr error on 3.3.9 also caused the archive to fail.

tooptoop4 avatar Sep 24 '22 02:09 tooptoop4

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. If this is a mentoring request, please provide an update here. Thank you for your contributions.

stale[bot] avatar Oct 15 '22 22:10 stale[bot]

crispy

tooptoop4 avatar Oct 20 '22 10:10 tooptoop4