Functionality similar to trigger_rule in Airflow
Currently we have an Airflow-based pipeline where one of the steps uses the trigger_rule='all_done' argument. This means that the next step will run even if the parent step fails.
Does Kubeflow have similar functionality, i.e. a way to trigger the next step based on certain conditions on the parents' exit status?
This could be a useful feature in Kubeflow, as it would allow steps to run even when a parent step fails; the parent step may not be critical to the whole pipeline.
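For illustration, here is a minimal Airflow sketch of the behaviour described above (the DAG and task names are made up; only the trigger_rule argument matters):

from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator
from airflow.utils.trigger_rule import TriggerRule

with DAG("all_done_example", start_date=datetime(2021, 1, 1), schedule_interval=None) as dag:
    extract = BashOperator(task_id="extract", bash_command="echo extracting")
    optional_step = BashOperator(task_id="optional_step", bash_command="exit 1")  # may fail

    report = BashOperator(
        task_id="report",
        bash_command="echo reporting",
        trigger_rule=TriggerRule.ALL_DONE,  # run once all parents finish, even if one failed
    )

    [extract, optional_step] >> report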
Currently we have an Airflow-based pipeline where one of the steps uses the trigger_rule='all_done' argument. This means that the next step will run even if the parent step fails.
Currently, Argo has very limited functionality for that (continueOn: Failed), so there is not much we can do right now.
Adding better support for dependencies to Argo would make it possible to do more. I'd add something like the following to Argo:
depends:
  or:
  - and: [completed: A, succeeded: B]
  - and: [completed: C, failed: D]
Another note is that Airflow does not have a concept of data dependencies. If the upstream task fails, what will the downstream task get instead of the data it was receiving from the upstream task?
Pipelines and Argo have support for an exit handler - a task that runs after all pipeline tasks complete (succeed or fail). Would that suit your needs?
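For reference, a minimal sketch of that exit-handler mechanism in the KFP v1 DSL; the step names and image here are illustrative, not from any real pipeline:

from kfp import dsl

def echo_op(text):
    # A trivial step used only for illustration.
    return dsl.ContainerOp(
        name="echo",
        image="alpine:3.14",
        command=["sh", "-c"],
        arguments=["echo " + text],
    )

@dsl.pipeline(name="exit-handler-example")
def pipeline():
    exit_task = echo_op("cleanup or notification")
    # Everything inside the `with` block runs first; exit_task then runs
    # afterwards, whether those steps succeeded or failed.
    with dsl.ExitHandler(exit_task):
        echo_op("main work that may fail")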
Currently, Argo has very limited functionality for that (continueOn: Failed), so there is not much we can do right now.
So continueOn: Failed is already supported in Argo? If so, that may be sufficient for our current needs. Can that be enabled in Kubeflow?
Adding better support for dependencies to Argo would make it possible to do more. I'd add something like the following to Argo:
depends:
  or:
  - and: [completed: A, succeeded: B]
  - and: [completed: C, failed: D]
We will discuss this internally.
Another note is that Airflow does not have a concept of data dependencies. If the upstream task fails, what will the downstream task get instead of the data it was receiving from the upstream task?
There is a possibility that a downstream task depends on multiple upstream tasks, and one of those upstream tasks may not be critical or may not modify the data. In that case, the downstream task may want to continue.
I believe better support for dependencies would enable wider use cases in Kubeflow.
Pipelines and Argo have support for an exit handler - a task that runs after all pipeline tasks complete (succeed or fail). Would that suit your needs?
We plan to use exit handlers, but I don't think they can replace our trigger_rule usage. If continueOn: failed is supported, then it may be OK for our current needs.
Also, there are some limitations with exit handlers. We needed exit handlers at the component level, but we found that this is not supported and there is an open bug for it.
Can that be enabled in Kubeflow?
I'm not sure we're going to add that particular feature. The design seems error-prone (it affects all dependent tasks), and it does not work with data passing. If a dependent task declares usage of the failed task's outputs, it will fail at submission time.
To work around the issue, you can easily make any task not fail by wrapping it in a script that swallows the error and creates the missing output files. That component can also have an output called "Succeeded" which indicates whether the task has completed successfully.
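A rough sketch of that wrapper idea as a KFP v1 function-based component; the names guarded_step, result, and succeeded are made up for illustration:

from typing import NamedTuple
from kfp.components import create_component_from_func

def guarded_step(input_path: str) -> NamedTuple("Outputs", [("result", str), ("succeeded", str)]):
    """Never fails: errors are swallowed and reported through the 'succeeded' output."""
    from collections import namedtuple
    outputs = namedtuple("Outputs", ["result", "succeeded"])
    try:
        # ... the real, possibly failing work goes here ...
        result = input_path + ".processed"  # placeholder for the real output
        return outputs(result, "true")
    except Exception:
        # Swallow the error and emit placeholder outputs so that downstream
        # tasks consuming them can still start.
        return outputs("", "false")

guarded_step_op = create_component_from_func(guarded_step)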
If a downstream step were to unblock when any of its parents completes, without having to wait for all the parents to complete, I guess that can only be supported by the framework.
Can that be enabled in Kubeflow?
I'm not sure we're going to add that particular feature. The design seems error-prone (it affects all dependent tasks), and it does not work with data passing. If a dependent task declares usage of the failed task's outputs, it will fail at submission time.
Wouldn't Airflow have similar issues? They would have implemented it to address certain use cases.
We are willing to contribute if we have the support.
/cc jingzhang36
a downstream task depends on multiple upstream tasks, and one of those upstream tasks may not be critical or may not modify the data
If the downstream task does not depend on the data produced by the upstream task, then should there be a dependency in the first place? I'd like to hear some scenarios.
a downstream step were to unblock when any of its parents completes without having to wait for all the parents to complete
How would the downstream task consume data from those parents? Can you please show a small mockup of how this would look from a syntax perspective?
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
This issue has been automatically closed because it has not had recent activity. Please comment "/reopen" to reopen it.
/reopen I would be happy to have the same functionality. I've even tested it already by manually changing workflow.yaml:
dag:
  tasks:
  - name: catch-fail
    template: catch-fail
    continueOn:
      failed: true
so I might try to propose an implementation by looking into how dependencies are handled in the kfp code. Scenario: we have a pipeline containing a few steps, but some are "optional", meaning that it is acceptable for us if they fail. Ideally the pipeline would then finish with the status "Warning" instead of "Success", but even without that, such functionality would be fine. We've implemented a workaround using dsl.Condition, but it's ugly and makes the pipeline less readable.
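For context, here is a guess at what such a dsl.Condition workaround can look like in the KFP v1 DSL, assuming the optional step reports success through an output instead of failing (all names here are illustrative):

from typing import NamedTuple
from kfp import dsl
from kfp.components import create_component_from_func

def optional_step() -> NamedTuple("Outputs", [("succeeded", str)]):
    from collections import namedtuple
    outputs = namedtuple("Outputs", ["succeeded"])
    try:
        # ... optional work that is allowed to fail ...
        return outputs("true")
    except Exception:
        return outputs("false")

optional_step_op = create_component_from_func(optional_step)

@dsl.pipeline(name="optional-step-workaround")
def pipeline():
    optional = optional_step_op()
    # Downstream work only runs when the optional step reports success, so its
    # failure does not fail the pipeline, but every consumer needs this extra
    # Condition block, which is what makes the pipeline harder to read.
    with dsl.Condition(optional.outputs["succeeded"] == "true"):
        dsl.ContainerOp(
            name="downstream",
            image="alpine:3.14",
            command=["sh", "-c"],
            arguments=["echo downstream work"],
        )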
@wiktorof: You can't reopen an issue/PR unless you authored it or you are a collaborator.
/reopen
It seems like there is still discussion to be had around allowing tasks to run, even if their parents fail.
I think most situations can be handled with a kfp.dsl.ExitHandler, which runs a single ContainerOp regardless of whether the ContainerOps it wraps succeed or fail.
But we might consider making functionality like ExitHandler more "implicit" by having an Airflow-style trigger_rule flag on the ContainerOp.
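A purely hypothetical mockup of that flag; no trigger_rule argument exists on ContainerOp today, so it is shown commented out:

from kfp import dsl

@dsl.pipeline(name="trigger-rule-mockup")
def pipeline():
    a = dsl.ContainerOp(name="a", image="alpine:3.14", command=["sh", "-c"], arguments=["echo a"])
    b = dsl.ContainerOp(name="b", image="alpine:3.14", command=["sh", "-c"], arguments=["exit 1"])

    c = dsl.ContainerOp(
        name="c",
        image="alpine:3.14",
        command=["sh", "-c"],
        arguments=["echo c"],
        # trigger_rule="all_done",  # hypothetical: run once a and b finish, even if one failed
    )
    c.after(a, b)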
I also see these issues/PRs are related:
- Issue: https://github.com/kubeflow/pipelines/issues/6749
- Issue: https://github.com/kubeflow/pipelines/issues/7330
- PR: https://github.com/kubeflow/pipelines/pull/7373
@thesuperzapper: Reopened this issue.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
This is still a critical issue if Pipelines is meant to compete with Airflow long-term.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
This issue is still important if we seriously want to compete with Airflow.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
This issue has been automatically closed because it has not had recent activity. Please comment "/reopen" to reopen it.
/reopen
@thesuperzapper: Reopened this issue.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.