Support Volcano scheduler integration by enabling PodGroup creation for podTask in Propeller
Why are the changes needed?
Support Volcano scheduler integration by enabling PodGroup creation for podTask in Propeller. Volcano is a Kubernetes-native batch scheduler optimized for high-performance, AI/ML, and big data workloads.
Before this change, all podTasks associated with a FlyteWorkflow shared a single PodGroup, which was automatically created just once by the Volcano controller when the first podTask was launched. Subsequent podTasks reused this PodGroup, which introduced limitations. Specifically, different podTasks could not be scheduled with different Volcano queues or job priorities, since the shared PodGroup enforced a single queue and priority for all.
This change enables Propeller to create individual Volcano PodGroups for each podTask, allowing them to specify distinct queues and job priorities for scheduling.
What changes were proposed in this pull request?
- The plugin manager for the k8s pod plugin creates Volcano PodGroups before creating the corresponding pods for podTasks. The PodGroup will have the same name as the pod, ensuring unique mapping between PodGroup and Pod. It also adds the PodGroup name to the pod’s annotations, ensuring that the Volcano PodGroup controller does not create the PodGroup again.
- Each PodGroup inherits the queue and priority from its Pod.
- Introduced a feature flag
enable-create-pod-group-for-podin the config to control whether this feature is enabled. The default value is false to ensure backward compatibility. - Regarding PodGroup garbage collection, the PodGroup is deleted by Kubernetes' GC. Since the PodGroup's owner reference points to a FlyteWorkflow, the deletion process follows a cascading effect. When the FlyteWorkflow is deleted, Kubernetes GC removes the associated PodGroup due to the absence of a living owner reference.
How was this patch tested?
Unit tests added Used internally at Linkedin.
Labels
Please add one or more of the following labels to categorize your PR:
- added: For new features.
- changed: For changes in existing functionality.
- deprecated: For soon-to-be-removed features.
- removed: For features being removed.
- fixed: For any bug fixed.
- security: In case of vulnerabilities
This is important to improve the readability of release notes.
Setup process
Screenshots
Check all the applicable boxes
- [x] I updated the documentation accordingly.
- [x] All new and existing tests passed.
- [x] All commits are signed-off.
Related PRs
Docs link
Summary by Bito
This pull request enhances the Volcano scheduler support by allowing the creation of individual PodGroups for each podTask in Propeller, improving scheduling flexibility with distinct queues and job priorities. A new configuration option ensures backward compatibility, and relevant unit tests have been added to validate and verify the new functionality.
Thank you for opening this pull request! 🙌
These tips will help get your PR across the finish line:
- Most of the repos have a PR template; if not, fill it out to the best of your knowledge.
- Sign off your commits (Reference: DCO Guide).
Hi reviewers, could you please help me add the added label to this PR? It seems I don’t have the access. Thank you!
Codecov Report
Attention: Patch coverage is 60.00000% with 32 lines in your changes missing coverage. Please review.
Project coverage is 58.50%. Comparing base (
8e0416d) to head (18b4387). Report is 17 commits behind head on master.
Additional details and impacted files
@@ Coverage Diff @@
## master #6457 +/- ##
==========================================
+ Coverage 58.47% 58.50% +0.02%
==========================================
Files 940 940
Lines 71584 71658 +74
==========================================
+ Hits 41861 41920 +59
- Misses 26540 26551 +11
- Partials 3183 3187 +4
| Flag | Coverage Δ | |
|---|---|---|
| unittests-datacatalog | 59.03% <ø> (ø) |
|
| unittests-flyteadmin | 56.22% <ø> (-0.02%) |
:arrow_down: |
| unittests-flytecopilot | 30.99% <ø> (ø) |
|
| unittests-flytectl | 64.72% <ø> (ø) |
|
| unittests-flyteidl | 76.12% <ø> (ø) |
|
| unittests-flyteplugins | 61.06% <0.00%> (+0.11%) |
:arrow_up: |
| unittests-flytepropeller | 54.81% <62.33%> (+0.02%) |
:arrow_up: |
| unittests-flytestdlib | 64.04% <ø> (ø) |
Flags with carried forward coverage won't be shown. Click here to find out more.
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
:rocket: New features to boost your workflow:
- :snowflake: Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
Without looking to deeply at this I feel that the plugin manager code probably shouldn't be doing exceptional work for a third party Kubernetes scheduler. My intuition is that you would want to create a new plugin for Volcano and configure your Propeller instance to use that as the default plugin.
Without looking to deeply at this I feel that the plugin manager code probably shouldn't be doing exceptional work for a third party Kubernetes scheduler. My intuition is that you would want to create a new plugin for Volcano and configure your Propeller instance to use that as the default plugin.
@Sovietaced Thanks for the review.
All the changes introduced in this PR are gated behind a configuration flag, which is disabled by default. This ensures that the plugin manager will not perform this functionality unless explicitly enabled.
This feature is primarily designed to support the current k8s Pod plugin, and more broadly, any future plugins that launch Pod resources. If I understand correctly, the current code structure assumes each plugin is responsible for managing only one k8s object. This assumption is why a new plugin doesn't work because additional logic is required in the plugin manager to create PodGroups before launching Pods.
For other plugins such as TFJob, PyTorchJob, and MPIJob, their corresponding operators are already integrated with Volcano and can create PodGroups on their own.
All the changes introduced in this PR are gated behind a configuration flag, which is disabled by default. This ensures that the plugin manager will not perform this functionality unless explicitly enabled.
I understand. My main concern is that core flyte propeller internals shouldn't have references to third party elements like volcano. You're using a feature flag to branch the logic in the generic plugin manager code to do something very specific for volcano.
I will wait for more of the Flyte core maintainers to look at this but I'm thinking it might make more sense to rearchitect Flyte to allow plugins to build multiple resources.
All the changes introduced in this PR are gated behind a configuration flag, which is disabled by default. This ensures that the plugin manager will not perform this functionality unless explicitly enabled.
I understand. My main concern is that core flyte propeller internals shouldn't have references to third party elements like volcano. You're using a feature flag to branch the logic in the generic plugin manager code to do something very specific for volcano.
I will wait for more of the Flyte core maintainers to look at this but I'm thinking it might make more sense to rearchitect Flyte to allow plugins to build multiple resources.
I see your point about making the plugin manager code more generic. I'm not sure how much effort it would take, or whether there are concrete plans to rearchitect the Flyte plugins to support building multiple resources. In the meantime, would it be possible to include this change in an earlier release so users can opt in earlier if needed? It could also help validate the effectiveness of a potential future rearchitecture. More feedback from the Flyte core maintainers would also be greatly appreciated.
@zhuo-zhi can we add docs here on how to prepare the K8s cluster for this integration?
@zhuo-zhi can we add docs here on how to prepare the K8s cluster for this integration?
@davidmirror-ops Thanks for the review. I've created a separate PR to document the preparation steps: https://github.com/unionai/docs/pull/372
All the changes introduced in this PR are gated behind a configuration flag, which is disabled by default. This ensures that the plugin manager will not perform this functionality unless explicitly enabled.
I understand. My main concern is that core flyte propeller internals shouldn't have references to third party elements like volcano. You're using a feature flag to branch the logic in the generic plugin manager code to do something very specific for volcano.
I will wait for more of the Flyte core maintainers to look at this but I'm thinking it might make more sense to rearchitect Flyte to allow plugins to build multiple resources.
I agree with everything @Sovietaced said, I also feel we cannot have such specialized logic in the plugin manager itself, even with a feature flag :/
All the changes introduced in this PR are gated behind a configuration flag, which is disabled by default. This ensures that the plugin manager will not perform this functionality unless explicitly enabled.
I understand. My main concern is that core flyte propeller internals shouldn't have references to third party elements like volcano. You're using a feature flag to branch the logic in the generic plugin manager code to do something very specific for volcano. I will wait for more of the Flyte core maintainers to look at this but I'm thinking it might make more sense to rearchitect Flyte to allow plugins to build multiple resources.
I agree with everything @Sovietaced said, I also feel we cannot have such specialized logic in the plugin manager itself, even with a feature flag :/
@fg91 @Sovietaced Thanks for the insights. I understand your concern. I’m wondering if there are any actionable steps you’d suggest I take to help move the PR forward?
Rearchitecting Flyte to support plugins that build multiple resources appears to be a significant change, as it would likely require refactoring all existing plugins along with the plugin manager. If that’s the only viable path for integration, would it be possible for me to get some support on this?
@fg91 @Sovietaced Thanks for the insights. I understand your concern. I’m wondering if there are any actionable steps you’d suggest I take to help move the PR forward?
I would defer to the Union folks or people that understand the Propeller architecture deeply.
I agree that re-architecting the plugin interface to allow for "extra resources" would be a bigger effort. I don't see a clean way to integrate the creation of resources like PodGroup into Flyte without this though.
The adapted plugin interface could in theory look like this:
type Plugin interface {
// Defines a func to create a query object (typically just object and type meta portions) that's used to query k8s
// resources.
BuildIdentityResource(ctx context.Context, taskCtx pluginsCore.TaskExecutionMetadata) (client.Object, error)
// Defines a func to create the full resource object that will be posted to k8s.
BuildResource(ctx context.Context, taskCtx pluginsCore.TaskExecutionContext) (client.Object, error)
// Analyses the k8s resource and reports the status as TaskPhase. This call is expected to be relatively fast,
// any operations that might take a long time (limits are configured system-wide) should be offloaded to the
// background.
GetTaskPhase(ctx context.Context, pluginContext PluginContext, resource client.Object) (pluginsCore.PhaseInfo, error)
// Properties desired by the plugin
GetProperties() PluginProperties
// New
// Defines a func to create query objects for the extra resources e.g. for aborting
BuildExtraIdentityResources(...) ([]client.Object, error)
// Defines a func to create the full extra resource objects that will be posted to k8s.
BuildExtraResources(...) ([]client.Object, error)
}
Which extra resources would be created would depend on the user configuration for the plugin returned by
GetProperties().
Flytepropeller would not do any state monitoring for those extra resources, just create them along the creation of the main task resource. And it would delete them on abort. In theory we could omit the BuildExtraIdentityResources if propeller injected an owner reference to the main task object into the extra resources so they would get cleaned up automatically if the main task object was deleted. That would leave us with the additional BuildExtraResources.
I don't think the existing plugins would create that much work because their respective BuildExtraResources functions would just do nothing initially. But the creation of the resources (and their deletion or injection of owner refs) would have to be implemented in the k8s plugin.
All this is just brainstorming though, something like this would definitely require an RFC (I could help with the process) and it also definitely would need early input from Union folks like @eapolinario @wild-endeavor, @kumare3.
Alternative approach:
If your main goal was not officially integrating Flyte with Volcano but you/your organization being able to use Flyte pod tasks with Volcano, I think a much lower hanging fruit (albeit not an official integration into flyte) would be to handle the PodGroup creation in a (mutating) admission webhook that gets triggered when flytepropeller creates the pods. You could deploy such a service along side Flyte. I've done something like this in the past for other resources.
Cc @troychiu this was the one