feat: Add successpolicy
SuccessPolicy is used in both PyTorchJob and TFJob, so I propose adding it to common.
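For reference, a minimal sketch of what a shared type could look like. The constant values, the meaning given to the default policy, and the `jobSucceeded` helper are assumptions made for illustration, not the code in this PR.

```go
package main

import "fmt"

// SuccessPolicy names the rule used to decide that a job has succeeded.
type SuccessPolicy string

const (
	// SuccessPolicyDefault is assumed here to mean: the job succeeds
	// as soon as the chief (or worker 0) finishes successfully.
	SuccessPolicyDefault SuccessPolicy = ""
	// SuccessPolicyAllWorkers is assumed here to mean: the job succeeds
	// only after every worker has finished successfully.
	SuccessPolicyAllWorkers SuccessPolicy = "AllWorkers"
)

// jobSucceeded is a hypothetical helper showing how a controller
// might apply the policy to the observed replica state.
func jobSucceeded(policy SuccessPolicy, chiefSucceeded bool, succeeded, total int) bool {
	switch policy {
	case SuccessPolicyAllWorkers:
		// Require at least one worker, and all of them succeeded.
		return total > 0 && succeeded == total
	default:
		// Any other value falls back to the default behavior.
		return chiefSucceeded
	}
}

func main() {
	fmt.Println(jobSucceeded(SuccessPolicyDefault, true, 1, 3))
	fmt.Println(jobSucceeded(SuccessPolicyAllWorkers, true, 2, 3))
	fmt.Println(jobSucceeded(SuccessPolicyAllWorkers, true, 3, 3))
}
```

Keeping the type as a string alias means both job kinds can share one definition in their specs while each controller decides how to evaluate it.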
Signed-off-by: cegao [email protected]
/assign @terrytangyuan @Jeffwan @zw0610
It is used in Katib, and I think it works, but we should still support successPolicy to keep the API consistent.
cc @andreyvelich
Yes, we use successCondition in GJSON format in our APIs to define the success condition for a Katib Trial's Workers, similar to Argo Workflows.
We could probably think about how to use it in the Training Operators.
Let's use https://github.com/kubeflow/training-operator/issues/1507 to track and discuss separately.
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: terrytangyuan
- ~~OWNERS~~ [terrytangyuan]
/assign @zw0610 @Jeffwan
LGTM. Meanwhile, could you add descriptions for SchedulingPolicy and SuccessPolicyAllWorkers to explain the expected behavior?
SGTM
/hold