Training: Add documentation for the MultiKueue and spec.managedBy API
Hi @Garvit-77. Thanks for your PR.
I'm waiting for a kubeflow member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.
Once the patch is verified, the new status will be reflected by the ok-to-test label.
@andreyvelich please review the PR and let me know if any changes are expected
Please help us with the review ack
FYI, I think the feature will be almost entirely a technical detail hidden from users, since we are going to default the field in MultiKueue via a webhook once https://github.com/kubernetes-sigs/kueue/issues/2552 is done. So I think describing the field is OK, but eventually a "plain" TFJob is all the user's YAML will contain, and we will be able to reference the MultiKueue docs, for example.
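To illustrate the defaulting described above, here is a minimal Python sketch of what such a webhook might do to a "plain" TFJob. It is not the Kueue implementation (that lives in Go). The helper name `default_managed_by` is hypothetical, and the field path `spec.runPolicy.managedBy`, the value `kueue.x-k8s.io/multikueue`, and the queue label are assumptions based on the Kueue MultiKueue notes for training-operator 1.9, not on anything stated in this thread. The sketch also simplifies by treating any job with a queue label as a MultiKueue job.

```python
import copy

# Assumed values (from the Kueue MultiKueue notes, not from this thread).
MULTIKUEUE_CONTROLLER = "kueue.x-k8s.io/multikueue"
QUEUE_LABEL = "kueue.x-k8s.io/queue-name"


def default_managed_by(job: dict) -> dict:
    """Return a copy of `job` with spec.runPolicy.managedBy defaulted.

    Hypothetical helper: it only fills the field when the job carries a
    Kueue queue label and the user has not set managedBy already.
    """
    result = copy.deepcopy(job)
    labels = result.get("metadata", {}).get("labels") or {}
    if QUEUE_LABEL not in labels:
        return result  # not submitted through Kueue, leave untouched
    run_policy = result.setdefault("spec", {}).setdefault("runPolicy", {})
    run_policy.setdefault("managedBy", MULTIKUEUE_CONTROLLER)
    return result


# A "plain" TFJob as the user would write it: no managedBy anywhere.
plain_tfjob = {
    "apiVersion": "kubeflow.org/v1",
    "kind": "TFJob",
    "metadata": {"name": "tfjob-sample", "labels": {QUEUE_LABEL: "user-queue"}},
    "spec": {"tfReplicaSpecs": {}},
}

defaulted = default_managed_by(plain_tfjob)
print(defaulted["spec"]["runPolicy"]["managedBy"])
```

The user's manifest stays unchanged; only the defaulted copy carries the field, which is why the docs could eventually show a plain TFJob and link to the MultiKueue docs for the details.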
FYI, we already have a note about this in Kueue, in the MD file: https://github.com/kubernetes-sigs/kueue/blob/main/site/content/en/docs/tasks/run/multikueue/kubeflow.md. The actual user-facing documentation at https://kueue.sigs.k8s.io/docs/ will be available when we release 0.11, which is planned for March 17th.
The location of the field changed between training-operator 1.9.0 and the new Trainer. Trainer is still not supported, though. Is this documentation page meant for 1.9.0, the new Trainer, or both? If both, should we add a note that the snippet presents the YAML only for training-operator 1.9.0?
These docs should be placed under the legacy guide for Kubeflow Training Operator 1.9: https://www.kubeflow.org/docs/components/trainer/legacy-v1/user-guides/
@Garvit-77 Please can you rebase this PR, so you have the correct location for the guide.
Hey @andreyvelich, I just rebased this on GitHub itself and made the changes. I had some issues with GPG while rebasing, so I hope it still meets the requirement.
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: andreyvelich
- ~~content/en/docs/components/trainer/OWNERS~~ [andreyvelich]
Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment