argo-workflows icon indicating copy to clipboard operation
argo-workflows copied to clipboard

retryStrategy step/DAG task scoped

Open tvalasek opened this issue 4 years ago • 14 comments

Summary

Have retryStrategy step/DAG task scoped rather than template scoped.

Motivation

Every step/DAG task using same template may require different retryStrategy .

Proposal

N/A


Message from the maintainers:

If you wish to see this enhancement implemented please add a 👍 reaction to this issue! We often sort issues this way to know what to prioritize.

tvalasek avatar Apr 18 '20 21:04 tvalasek

Is this not the same as #1891 ?

markterm avatar Apr 22 '20 08:04 markterm

to my understanding #1891 is "set of steps" scoped. This one wants concrete step/DAG task scoped

tvalasek avatar Apr 22 '20 15:04 tvalasek

I brought this up during an internal meeting, we agreed that this might be a desirable feature but because of how our controller is structured might be more difficulty than it's worth. I'm going to work on a PoC and if there are no issues we can move forward with this.

simster7 avatar Apr 22 '20 17:04 simster7

was this feature iceboxed ?

tvalasek avatar Jun 05 '20 14:06 tvalasek

was this feature iceboxed ?

For now it is. The reason has mostly to do with design: we didn't want to start blurring the template definition/template caller abstraction for now. Keeping the issue open though as we might revisit this in the future.

Please feel free to add your thoughts.

simster7 avatar Jun 05 '20 15:06 simster7

Please feel free to add your thoughts.

I know that one suggested approach was to use templatePatch and be able to patch all template fields. This feature request is not asking for all that, just one feature and that is retryStrategy.

Still not feasible? I saw you made it to work https://github.com/argoproj/argo/pull/2894/commits/1767952f26941804cf55cd3b87ba18bc538607b9

Thanks

tvalasek avatar Jun 05 '20 15:06 tvalasek

I think retryStrategy is already supported in DAG/Step scope.

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: retry-with-steps-
spec:
  entrypoint: retry-with-steps
  templates:
  - name: retry-with-steps
    retryStrategy:
      limit: 10
    steps:
    - - name: hello1
        template: random-fail
    - - name: hello2a
        template: random-fail
      - name: hello2b
        template: random-fail
  - name: random-fail
    container:
      image: python:alpine3.6
      command: [python, -c]
      # fail with a 66% probability
      args: ["import random; import sys; print('retries: {{retries}}'); exit_code = random.choice([0, 1, 1]); sys.exit(exit_code)"]

sarabala1979 avatar Jun 05 '20 16:06 sarabala1979

@sarabala1979 This issue refers to overriding retryStrategy in a step caller:

  templates:
  - name: retry-with-steps
    steps:
    - - name: hello1
        template: random-fail
        retryStrategy:
          limit: 10
    - - name: hello2a
        template: random-fail
      - name: hello2b
        template: random-fail
  - name: random-fail

simster7 avatar Jun 05 '20 16:06 simster7

I think retryStrategy is already supported in DAG/Step scope.

that is template scoped. what this FR is asking for is https://github.com/argoproj/argo/commit/1767952f26941804cf55cd3b87ba18bc538607b9#diff-ac16b3112943294b0702cf25461e0ee7

tvalasek avatar Jun 05 '20 16:06 tvalasek

Related to #3965

alexec avatar Sep 16 '20 19:09 alexec

that is template scoped.

A DAG is a template?

alexec avatar Sep 16 '20 19:09 alexec

for steps this shows exactly what is asked: https://github.com/argoproj/argo/commit/1767952f26941804cf55cd3b87ba18bc538607b9#diff-ac16b3112943294b0702cf25461e0ee7

for DAG something very similar, like so:

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: retry-override-
spec:
  entrypoint: steps
  templates:
    - name: some-dag
      dag:
        tasks:
        - name: fail-0-retry
          template: fail
          retryStrategy:
            limit: 0
        - name: fail-1-retry
          template: fail
        - name: fail-2-retry
          template: fail
          retryStrategy:
            limit: 2

    - name: fail
      retryStrategy:
        limit: 1
      container:
        image: alpine
        command: [sh, -c]
        args: ["exit 1"]

tvalasek avatar Sep 18 '20 17:09 tvalasek

@tvalasek in v2.12 (#3965) we'll be introducing workflow-level retry strategy. Would this meet your use case?

alexec avatar Sep 18 '20 17:09 alexec

:) i would still prefer step/dag task scoped but if is it too much of an ask, workflow-level retry is OK too, as long as you mean workflowtemplate-level too :) Thanks

tvalasek avatar Sep 18 '20 17:09 tvalasek