xgboost-operator
How to run distributed training from Kubeflow Pipelines SDK?
The example linked in the README (https://github.com/kubeflow/xgboost-operator/tree/master/config/samples/xgboost-dist) shows that spawning a distributed training job requires running kubectl. I want to run distributed XGBoost training as part of a larger Kubeflow pipeline — how can I achieve this? Is it possible to spawn the distributed job from the Python code itself, or from the Kubeflow Pipelines SDK?
Issue-Label Bot is automatically applying the labels:
| Label | Probability |
|---|---|
| question | 0.86 |
I think you can run an XGBoostJob as part of Kubeflow Pipelines, similar to the other Kubeflow operators, but I'm not familiar enough with Kubeflow Pipelines to be sure. Try it out and let us know if you run into any issues.
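One approach that should work is to build the XGBoostJob custom resource as a plain Python dict and submit it from a pipeline step. A minimal sketch, assuming the `apiVersion`, `kind`, and `xgbReplicaSpecs` layout from the xgboost-dist sample (verify these against the CRD version installed in your cluster — field names have changed across operator releases):

```python
def xgboostjob_manifest(name, image, workers=2, namespace="kubeflow"):
    """Build an XGBoostJob custom-resource manifest as a plain dict.

    The apiVersion/kind and the xgbReplicaSpecs structure are assumptions
    based on the xgboost-operator samples; check them against your CRD.
    """
    def replica_spec(count):
        # One replica spec per role; the Master coordinates, Workers train.
        return {
            "replicas": count,
            "restartPolicy": "Never",
            "template": {
                "spec": {
                    "containers": [{
                        "name": "xgboostjob",
                        "image": image,
                    }]
                }
            },
        }

    return {
        "apiVersion": "xgboostjob.kubeflow.org/v1alpha1",
        "kind": "XGBoostJob",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "xgbReplicaSpecs": {
                "Master": replica_spec(1),
                "Worker": replica_spec(workers),
            }
        },
    }
```

From the Kubeflow Pipelines v1 SDK, a dict like this can be handed to `kfp.dsl.ResourceOp` (with `action="create"` plus `success_condition`/`failure_condition` to block the pipeline until the job finishes), which is the same mechanism used to launch other Kubeflow training CRDs from a pipeline. Outside a pipeline, the kubernetes Python client's `CustomObjectsApi.create_namespaced_custom_object` can submit it directly — both routes avoid shelling out to kubectl.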
I think it's related to https://github.com/kubeflow/pipelines/issues/973