Add Helm Charts for Kubeflow Trainer V2

Open gaocegege opened this issue 5 years ago • 23 comments

gaocegege avatar Dec 11 '20 10:12 gaocegege

I would like to work on this issue. Since I'm new here, can someone guide me?

sanjeepan23 avatar Dec 11 '20 14:12 sanjeepan23

Thanks, @goat023

You can have a look at other operators' Helm charts, such as those for tidb-operator or etcd-operator. Ours should be similar to them.

gaocegege avatar Dec 12 '20 07:12 gaocegege

There's also a Helm chart for mpi-operator for reference, but it probably needs some updates as well: https://github.com/kubeflow/mpi-operator/tree/master/hack/helm/mpi-operator

terrytangyuan avatar Dec 12 '20 20:12 terrytangyuan

Thanks, I'll try

sanjeepan23 avatar Dec 13 '20 03:12 sanjeepan23

Where can I get the maintainers' names and email addresses for this repo?

sanjeepan23 avatar Dec 13 '20 04:12 sanjeepan23

https://github.com/kubeflow/tf-operator/blob/master/OWNERS

gaocegege avatar Dec 14 '20 02:12 gaocegege

@goat023 Hi, is there any progress?

gaocegege avatar Jan 18 '21 06:01 gaocegege

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Jun 02 '21 17:06 stale[bot]

Do you still need one?

ehudyonasi avatar Jun 15 '22 12:06 ehudyonasi

In order to ensure that the Helm charts stay in sync with the manifests, we can add one extra check in GitHub Actions.
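
As a rough illustration of what such a check might look like, here is a minimal sketch, not an existing script in this repo. It assumes the chart carries plain copies of some source YAML files (for example CRDs), and it reports any file that is missing or whose contents differ. The function names and the two directory arguments are hypothetical.

```python
import hashlib
import sys
from pathlib import Path


def file_digests(root: Path) -> dict[str, str]:
    """Map each *.yaml file name directly under root to the SHA-256 of its bytes."""
    return {
        path.name: hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.glob("*.yaml"))
    }


def find_drift(source_dir: Path, chart_dir: Path) -> list[str]:
    """Return one message per file that is out of sync between the two directories."""
    source = file_digests(source_dir)
    chart = file_digests(chart_dir)
    problems = []
    for name in sorted(source.keys() | chart.keys()):
        if name not in chart:
            problems.append(f"{name}: missing from chart")
        elif name not in source:
            problems.append(f"{name}: not present in source manifests")
        elif source[name] != chart[name]:
            problems.append(f"{name}: contents differ")
    return problems


def main(argv: list[str]) -> int:
    """Print any drift and return a process exit code (0 = in sync, 1 = drift)."""
    if len(argv) != 3:
        print("usage: check_sync.py <source_dir> <chart_dir>", file=sys.stderr)
        return 2
    problems = find_drift(Path(argv[1]), Path(argv[2]))
    for problem in problems:
        print(problem)
    return 1 if problems else 0
```

A workflow step could call `sys.exit(main(sys.argv))` so that the job fails whenever the two directories drift apart. Comparing byte-for-byte hashes only works for files copied verbatim; templated chart resources would need a comparison of rendered output instead.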

johnugeorge avatar Jul 18 '22 05:07 johnugeorge

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

github-actions[bot] avatar Oct 04 '23 20:10 github-actions[bot]

This issue has been automatically closed because it has not had recent activity. Please comment "/reopen" to reopen it.

github-actions[bot] avatar Oct 24 '23 20:10 github-actions[bot]

More and more Kubernetes projects are moving toward Helm charts for deployment:

  • Kueue: https://github.com/kubernetes-sigs/kueue/tree/main/charts/kueue
  • JobSet: https://github.com/kubernetes-sigs/jobset/issues/726

I think we should consider creating a Helm chart for at least Kubeflow Training V2. WDYT @kubeflow/wg-training-leads @ChenYi015?

andreyvelich avatar Dec 19 '24 22:12 andreyvelich

/remove-lifecycle stale

andreyvelich avatar Dec 19 '24 22:12 andreyvelich

I strongly agree with adding a Helm chart for the training operator, and I am willing to maintain it, drawing on my experience with the Spark operator Helm chart.

ChenYi015 avatar Dec 20 '24 14:12 ChenYi015

Great, thank you @ChenYi015! I would also suggest taking a look at the sync script that @tenzen-y uses for the Kueue project: https://github.com/kubeflow/training-operator/pull/2263#issuecomment-2407994501

andreyvelich avatar Dec 20 '24 14:12 andreyvelich

+1 as long as we can sustain the maintenance

terrytangyuan avatar Dec 22 '24 16:12 terrytangyuan

Let's include this feature as part of Release 2.0, so users can deploy Kubeflow Training V2 with Helm charts. @ChenYi015 Do you have bandwidth to work on this and review @tenzen-y's script for keeping the Helm charts and Kustomize manifests in sync: https://github.com/kubernetes-sigs/kueue/tree/main/hack ?

andreyvelich avatar Jan 08 '25 18:01 andreyvelich

> Do you have bandwidth to work on this and review @tenzen-y script to keep Helm Charts and Kustomize manifests in sync: https://github.com/kubernetes-sigs/kueue/tree/main/hack ?

I will do it when I have time.

ChenYi015 avatar Jan 09 '25 11:01 ChenYi015

Hi @ChenYi015, it would be nice if we could prepare Helm charts for Kubeflow Trainer before we release the first version. Do you think you still have time to help us, or should we ask the community for help?

andreyvelich avatar Feb 11 '25 20:02 andreyvelich

@andreyvelich OK, I will update the Helm charts ASAP. When are we going to release the first version of V2?

ChenYi015 avatar Feb 12 '25 02:02 ChenYi015

@ChenYi015 We are planning to release V2 right after we implement the MPI Plugin: https://github.com/kubeflow/trainer/pull/2394

andreyvelich avatar Feb 12 '25 15:02 andreyvelich

/assign @ChenYi015 cc @andreyvelich

varodrig avatar Feb 14 '25 03:02 varodrig