
functionality similar to trigger_rule in airflow

Open deepio-oc opened this issue 6 years ago • 13 comments

Currently we have an Airflow-based pipeline where one of the steps uses the trigger_rule='all_done' argument. This means that the next step will run even if the parent step has failed.

Does Kubeflow have similar functionality, i.e. a way to trigger the next step based on certain conditions on the parents' exit status?

This could be a useful feature in Kubeflow: it would allow steps to run even when a parent step has failed, since the parent step may not be critical to the whole pipeline.
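
For reference, a minimal sketch of the Airflow pattern in question (DAG and task names are illustrative):

    from datetime import datetime
    from airflow import DAG
    from airflow.operators.bash import BashOperator

    with DAG(dag_id="example", start_date=datetime(2019, 1, 1), schedule_interval=None) as dag:
        # This step may fail, but the pipeline should continue regardless.
        optional_step = BashOperator(task_id="optional_step", bash_command="exit 1")

        # trigger_rule="all_done" runs this step once all upstream tasks have
        # finished, whether they succeeded or failed.
        next_step = BashOperator(
            task_id="next_step",
            bash_command="echo continuing",
            trigger_rule="all_done",
        )

        optional_step >> next_step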

deepio-oc avatar Oct 11 '19 21:10 deepio-oc

Currently we have an Airflow-based pipeline where one of the steps uses the trigger_rule='all_done' argument. This means that the next step will run even if the parent step has failed.

Currently, Argo has very limited functionality for that (continueOn: Failed), so there is not much we can do right now.

Adding better support for dependencies to Argo would make it possible to do more. I'd add something like the following to Argo:

depends:
  or:
  - and: [completed: A, succeeded: B]
  - and: [completed: C, failed: D]

Another note is that Airflow does not have a concept of data dependencies. If the upstream task fails, what will the downstream task get instead of the data it was receiving from the upstream?

Pipelines and Argo have support for exit handler - a task that runs after all pipeline tasks complete (succeed or fail). Would that suit your needs?
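
For context, a minimal sketch of an exit handler in the KFP v1 Python DSL (image and commands are illustrative):

    import kfp.dsl as dsl

    @dsl.pipeline(name="exit-handler-example")
    def pipeline():
        # The exit op runs after everything inside the ExitHandler block
        # finishes, whether those tasks succeeded or failed.
        exit_op = dsl.ContainerOp(
            name="cleanup",
            image="alpine:3.12",
            command=["echo", "pipeline finished"],
        )
        with dsl.ExitHandler(exit_op):
            main_task = dsl.ContainerOp(
                name="main-task",
                image="alpine:3.12",
                command=["sh", "-c", "echo running; exit 1"],  # may fail
            )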

Ark-kun avatar Oct 11 '19 23:10 Ark-kun

Currently, Argo has very limited functionality for that (continueOn: Failed), so there is not much we can do right now.

So (continueOn: Failed) is already supported in Argo? If so, that may be sufficient for our current needs. Can that be enabled in Kubeflow?

Adding better support for dependencies to Argo would make it possible to do more. I'd add something like the following to Argo:

depends:
  or:
  - and: [completed: A, succeeded: B]
  - and: [completed: C, failed: D]

We will discuss this internally.

Another note is that Airflow does not have a concept of data dependencies. If the upstream task fails, what will the downstream task get instead of the data it was receiving from the upstream?

There is a possibility that a downstream task depends on multiple upstream tasks, and one of those upstream tasks is not critical or does not touch the data. In that case, the downstream task may want to continue.

I believe better support for dependencies will enable wider use cases in Kubeflow.

Pipelines and Argo have support for exit handler - a task that runs after all pipeline tasks complete (succeed or fail). Would that suit your needs?

We plan to use exit handlers, but I don't think they can replace our trigger_rule usage. If continueOn: failed is supported, then it may be OK for our current needs.

Also, there is a limitation with exit handlers: we needed them at the component level, but we found that this is not supported and there is an open bug on that.

deepio-oc avatar Oct 13 '19 14:10 deepio-oc

Can that be enabled in Kubeflow?

I'm not sure we're going to add that particular feature. The design seems to be error-prone (it affects all dependent tasks), and it does not work with data passing. If a dependent task declares usage of the failed task's outputs, it will fail at submission time.

To work around the issue, you can easily make any task not fail by wrapping it in a script that swallows the error and creates the missing output files, as sketched below. That component can also have an output called "Succeeded", which will indicate whether the task completed successfully.
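
A minimal sketch of that workaround in the KFP v1 DSL, where "my-real-command" is a placeholder for the actual step:

    import kfp.dsl as dsl

    def never_failing_step():
        # Run the real command; on failure, create an empty placeholder output
        # and record the result in a "Succeeded" output instead of failing.
        return dsl.ContainerOp(
            name="never-failing-step",
            image="alpine:3.12",
            command=["sh", "-c"],
            arguments=[
                "if my-real-command > /tmp/out.txt; then"
                " echo -n true > /tmp/succeeded.txt;"
                " else touch /tmp/out.txt; echo -n false > /tmp/succeeded.txt; fi"
            ],
            file_outputs={
                "output": "/tmp/out.txt",
                "Succeeded": "/tmp/succeeded.txt",
            },
        )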

Ark-kun avatar Oct 14 '19 18:10 Ark-kun

If a downstream step were to unblock when any of its parents completes, without having to wait for all the parents to complete, I guess that can only be supported by the framework.

songole avatar Nov 08 '19 23:11 songole

Can that be enabled in Kubeflow?

I'm not sure we're going to add that particular feature. The design seems to be error-prone (it affects all dependent tasks), and it does not work with data passing. If a dependent task declares usage of the failed task's outputs, it will fail at submission time.

Wouldn't Airflow have similar issues? They presumably implemented it to address certain use cases.

We are willing to contribute if we have the support.

songole avatar Nov 08 '19 23:11 songole

/cc jingzhang36

rmgogogo avatar Nov 18 '19 07:11 rmgogogo

a downstream task depends on multiple upstream tasks, and one of those upstream tasks is not critical or does not touch the data

If the downstream does not depend on the data produced by the upstream, then should there be a dependency in the first place? I'd like to hear some scenarios.

a downstream step were to unblock when any of its parents completes without having to wait for all the parents to complete

How would the downstream task consume data from those parents? Can you please show a small mockup of how this would look from a syntax perspective?

Ark-kun avatar Jan 16 '20 02:01 Ark-kun

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Jun 24 '20 19:06 stale[bot]

This issue has been automatically closed because it has not had recent activity. Please comment "/reopen" to reopen it.

stale[bot] avatar Jul 02 '20 06:07 stale[bot]

/reopen I would be happy to have the same functionality. I've even tested this already by manually changing the workflow.yaml:

    dag:
      tasks:
      - name: catch-fail
        template: catch-fail
        continueOn:
          failed: true

So I might try to propose an implementation, looking into how dependencies are handled in the kfp code. Scenario: we have a pipeline containing a few steps, but some are "optional", meaning it is acceptable for us if they fail. Ideally the pipeline would then finish with the status "Warning" instead of "Success", but even without that, such functionality is fine. We've implemented a workaround using dsl.Condition (sketched below), but it's ugly and makes things less readable.
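
For illustration, such a dsl.Condition workaround might look like this in the v1 DSL, assuming an upstream step that swallows failures and exposes a "Succeeded" output (as in the wrapper sketch earlier in this thread):

    import kfp.dsl as dsl

    @dsl.pipeline(name="condition-workaround")
    def pipeline():
        optional = never_failing_step()  # hypothetical wrapper with a "Succeeded" output
        # The downstream step runs only when the optional step reported success;
        # the run itself never fails because the wrapper swallowed the error.
        with dsl.Condition(optional.outputs["Succeeded"] == "true"):
            downstream = dsl.ContainerOp(
                name="downstream",
                image="alpine:3.12",
                command=["echo", "upstream succeeded"],
            )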

wiktorof avatar Oct 15 '21 11:10 wiktorof

@wiktorof: You can't reopen an issue/PR unless you authored it or you are a collaborator.

In response to this:

/reopen I would be happy to have the same functionality. I've even tested this already by manually changing the workflow.yaml:

    dag:
      tasks:
      - name: catch-fail
        template: catch-fail
        continueOn:
          failed: true

So I might try to propose an implementation, looking into how dependencies are handled in the kfp code. Scenario: we have a pipeline containing a few steps, but some are "optional", meaning it is acceptable for us if they fail. Ideally the pipeline would then finish with the status "Warning" instead of "Success", but even without that, such functionality is fine. We've implemented a workaround using dsl.Condition, but it's ugly and makes things less readable.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

google-oss-robot avatar Oct 15 '21 11:10 google-oss-robot

/reopen

It seems like there is still discussion to be had around allowing tasks to run, even if their parents fail.

I think most situations can be handled with a kfp.dsl.ExitHandler, which runs a single ContainerOp regardless of whether the ContainerOps it wraps succeed or fail.

But we might consider making functionality like ExitHandler more "implicit" by having an Airflow-style trigger_rule flag on the ContainerOp, as in the mockup below.
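
Purely as a mockup (trigger_rule is not an actual ContainerOp argument today):

    import kfp.dsl as dsl

    @dsl.pipeline(name="trigger-rule-mockup")
    def pipeline():
        parent = dsl.ContainerOp(
            name="parent",
            image="alpine:3.12",
            command=["sh", "-c", "exit 1"],  # fails
        )
        child = dsl.ContainerOp(
            name="child",
            image="alpine:3.12",
            command=["echo", "runs even if parent failed"],
            # trigger_rule="all_done",  # HYPOTHETICAL Airflow-style flag, not in kfp
        )
        child.after(parent)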

I also see these issues/PRs are related:

  • Issue: https://github.com/kubeflow/pipelines/issues/6749
  • Issue: https://github.com/kubeflow/pipelines/issues/7330
    • PR: https://github.com/kubeflow/pipelines/pull/7373

thesuperzapper avatar Mar 10 '22 06:03 thesuperzapper

@thesuperzapper: Reopened this issue.

In response to this:

/reopen

It seems like there is still discussion to be had around allowing tasks to run, even if their parents fail.

I think most situations can be handled with a kfp.dsl.ExitHandler, which runs a single ContainerOp regardless of whether the ContainerOps it wraps succeed or fail.

But we might consider making functionality like ExitHandler more "implicit" by having an Airflow-style trigger_rule flag on the ContainerOp.

I also see these issues/PRs are related:

  • Issue: https://github.com/kubeflow/pipelines/issues/6749
  • Issue: https://github.com/kubeflow/pipelines/issues/7330
    • PR: https://github.com/kubeflow/pipelines/pull/7373

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

google-oss-prow[bot] avatar Mar 10 '22 06:03 google-oss-prow[bot]

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

github-actions[bot] avatar Sep 08 '24 07:09 github-actions[bot]

This is still a critical issue if Pipelines is meant to compete with Airflow long-term.

thesuperzapper avatar Sep 08 '24 17:09 thesuperzapper

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

github-actions[bot] avatar Nov 08 '24 07:11 github-actions[bot]

This issue is still important if we seriously want to compete with Airflow.

thesuperzapper avatar Nov 08 '24 17:11 thesuperzapper

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

github-actions[bot] avatar Jan 08 '25 07:01 github-actions[bot]

This issue has been automatically closed because it has not had recent activity. Please comment "/reopen" to reopen it.

github-actions[bot] avatar Jan 29 '25 07:01 github-actions[bot]

/reopen

thesuperzapper avatar Jan 29 '25 17:01 thesuperzapper

@thesuperzapper: Reopened this issue.

In response to this:

/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

google-oss-prow[bot] avatar Jan 29 '25 17:01 google-oss-prow[bot]

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

github-actions[bot] avatar Apr 01 '25 07:04 github-actions[bot]