Proposal: Donate the MPI-Operator.V2 to kubernetes-sigs
Signed-off-by: Carlos Eduardo Arango Gutierrez [email protected]
/cc @alculquicondor
https://github.com/kubeflow/mpi-operator/issues/459
@kubeflow/project-steering-group
/assign @bobgy
/assign @theadactyl
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: ArangoGutierrez
To complete the pull request process, please ask for approval from bobgy after the PR has been reviewed.
The full list of commands accepted by this bot can be found here.
Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment
Thanks for opening this! I don't see an assessment anywhere of how this impacts existing users of the MPI operator and/or the unified operator, or of what the implications are for the unified operator going forward. The same goes for how this might impact other parts of Kubeflow. I think that's important to include -- can you do so?
Additionally, what in particular is the value of donation here, vs. just maintaining the MPI operator within Kubeflow with fewer dependencies that would block HPC users? Donation isn't a lightweight process or decision, so I'm interested to see what the specific advantages and disadvantages are, so we can call out and validate any assumptions being made here.
Also, when there are a couple more updates, I'd like to get feedback from the community. This would benefit from being sent to kubeflow-discuss & having a spot for discussion in an upcoming Community Meeting.
How does that sound?
I would like to see a more detailed proposal for the migration plan. Specifically:
- How do we avoid having two divergent versions of the MpiJob?
- Assuming that the new MpiJob replaces the current version, how do we handle installation of the new MPI operator?
- How do we handle common dependencies?
- Will the new MpiJob API be backward compatible?
- What will be the release and versioning plan going forward?
Hi @ArangoGutierrez, any new progress?
Given https://github.com/cncf/toc/pull/950 let's /hold
I think I now have time to get back to this thread. Ping @theadactyl @alculquicondor @richardsliu: what would it take to revive it?
While this is still open, it would be a lot of effort from a copyright perspective https://github.com/cncf/toc/pull/950
So I think we should wait.
I'd recommend waiting on CNCF donation review unless there is something being significantly blocked here.
cc
https://github.com/cncf/toc/pull/1042
I guess we can restart this effort?
I think we can rather drop it.
Now that Kubeflow is part of CNCF, it's easier for organizations to contribute to the project.
@ArangoGutierrez do you have an additional motivation to make this a Kubernetes subproject?
Agree /Close
@ArangoGutierrez: Closed this PR.
If this is the case, can we please make some effort to update the documentation on the Kubeflow website to reflect that there are now two separate components:
- The unified training operator: https://github.com/kubeflow/training-operator
- The MPI Operator: https://github.com/kubeflow/mpi-operator
We need to update and split the following component page: https://www.kubeflow.org/docs/components/training/