metaflow
Add support for Kueue.
This commit adds support for using Kueue to submit jobs/pods into Kubernetes. There are two config options:
- KUEUE_ENABLED: set to True/False
- KUEUE_LOCALQUEUE_NAME: set to the name of the LocalQueue configured with Kueue. See the Kueue documentation for details.
The config options can be set in the main metaflow config or via the @kubernetes decorator.
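To illustrate the intended precedence between the two places these options can be set, here is a minimal sketch. It is not the code from this commit: `resolve_kueue_settings`, its parameters, and the sample queue name `team-queue` are hypothetical names chosen for illustration, and the real decorator attribute names may differ.

```python
from typing import Optional, Tuple

# Hypothetical global defaults, mirroring the two config options named above.
GLOBAL_CONFIG = {"KUEUE_ENABLED": True, "KUEUE_LOCALQUEUE_NAME": "team-queue"}


def resolve_kueue_settings(
    global_config: dict,
    step_enabled: Optional[bool] = None,
    step_queue: Optional[str] = None,
) -> Tuple[bool, Optional[str]]:
    """Return (enabled, queue_name); step-level values win over global ones.

    A step-level value of None means "not set on the step", so the global
    config is used. Kueue is off when nothing is set anywhere.
    """
    if step_enabled is None:
        enabled = bool(global_config.get("KUEUE_ENABLED", False))
    else:
        enabled = step_enabled

    if step_queue is None:
        queue = global_config.get("KUEUE_LOCALQUEUE_NAME")
    else:
        queue = step_queue

    if not enabled:
        # The queue name is irrelevant when Kueue is disabled for this step.
        return False, None
    if not queue:
        raise ValueError("Kueue is enabled but no LocalQueue name is set")
    return True, queue


# Global config applies when the step sets nothing.
print(resolve_kueue_settings(GLOBAL_CONFIG))
# A step can opt out even when Kueue is enabled globally.
print(resolve_kueue_settings(GLOBAL_CONFIG, step_enabled=False))
```

Using `None` as the "unset" marker keeps an explicit step-level `False` distinguishable from a missing value, which is what lets a single step opt out.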
Testing Done:
- Verified that specifying Kueue config options in the Metaflow config (~/.metaflowconfig/config.json) works as expected.
- Verified that specifying Kueue config options in @kubernetes works as expected.
- Verified that @kubernetes options take precedence over the global config.
  - If the global KUEUE_ENABLED config is True but it is set to False for a particular step, that step does not run with Kueue.
- Verified that the Kueue labels and annotations are set correctly and that Kueue actually runs the jobs.
- Verified that if Kueue is configured to manage "pod", Metaflow-created Argo Workflows pods are scheduled by Kueue.
- Verified that the default behavior is to not use Kueue and that everything works as before (jobs and Argo Workflows).
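For readers who want to see what the label check above amounts to, the sketch below attaches Kueue's queue label to a Job or Pod manifest held as a plain dict. The label key `kueue.x-k8s.io/queue-name` is the one Kueue uses to assign a workload to a LocalQueue; the helper `apply_kueue_labels` and the sample manifest are hypothetical, and any further labels or annotations set by this commit are not reproduced here.

```python
import copy
from typing import Optional

# Label that Kueue reads to assign a workload to a LocalQueue.
KUEUE_QUEUE_LABEL = "kueue.x-k8s.io/queue-name"


def apply_kueue_labels(manifest: dict, enabled: bool, queue_name: Optional[str]) -> dict:
    """Return a copy of the manifest, labelled for Kueue when enabled.

    When Kueue is disabled the manifest is returned unchanged, matching the
    default behavior of not using Kueue at all.
    """
    result = copy.deepcopy(manifest)
    if not enabled:
        return result
    if not queue_name:
        raise ValueError("a LocalQueue name is required when Kueue is enabled")
    labels = result.setdefault("metadata", {}).setdefault("labels", {})
    labels[KUEUE_QUEUE_LABEL] = queue_name
    return result


# Hypothetical minimal Job manifest, for illustration only.
job = {"apiVersion": "batch/v1", "kind": "Job", "metadata": {"name": "demo-step"}}

labelled = apply_kueue_labels(job, enabled=True, queue_name="team-queue")
print(labelled["metadata"]["labels"])
```

Copying the manifest before modifying it keeps the disabled path trivially identical to the pre-Kueue behavior.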
Mergeable anytime from my end -- no impact on core.
I'm interested in this PR - is it actually going to happen or dead in the water?
Hi there, I would be interested as well