volcano
Set user customized wait-timeout-seconds for PodGroup based gang scheduling protocol.
What would you like to be added:
Introduce a new field in podGroup.spec, named waitTimeoutSeconds or something similar, so that users can configure the wait timeout dynamically.
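A minimal sketch of how such a field might be consumed, assuming it is an optional number of seconds. The PodGroupSpec type below is a simplified local stand-in rather than the real Volcano API type, and WaitTimeoutSeconds, defaultWaitTimeout, waitTimeout and waitExpired are hypothetical names used only for illustration:

```go
package main

import (
	"fmt"
	"time"
)

// PodGroupSpec is a simplified local stand-in for the PodGroup spec.
// WaitTimeoutSeconds is the hypothetical field requested in this issue;
// it is not part of the existing API definition.
type PodGroupSpec struct {
	MinMember          int32
	WaitTimeoutSeconds *int64 // nil means "fall back to a default"
}

// defaultWaitTimeout is an assumed fallback value, chosen for illustration.
const defaultWaitTimeout = 10 * time.Minute

// waitTimeout returns the user-configured timeout, or the default when the
// field is unset or not positive.
func waitTimeout(spec PodGroupSpec) time.Duration {
	if spec.WaitTimeoutSeconds == nil || *spec.WaitTimeoutSeconds <= 0 {
		return defaultWaitTimeout
	}
	return time.Duration(*spec.WaitTimeoutSeconds) * time.Second
}

// waitExpired reports whether a gang that has been pending since
// pendingSince has waited at least its configured timeout.
func waitExpired(spec PodGroupSpec, pendingSince, now time.Time) bool {
	return now.Sub(pendingSince) >= waitTimeout(spec)
}

func main() {
	secs := int64(300)
	spec := PodGroupSpec{MinMember: 4, WaitTimeoutSeconds: &secs}
	start := time.Date(2022, 1, 1, 0, 0, 0, 0, time.UTC)
	fmt.Println(waitExpired(spec, start, start.Add(2*time.Minute)))
	fmt.Println(waitExpired(spec, start, start.Add(6*time.Minute)))
}
```

Using a pointer keeps "unset" distinguishable from zero, so existing PodGroups without the field would keep whatever default the scheduler applies.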
Why is this needed:
As described in the resource-reservation design doc, this feature is on the TODO list, but I cannot find a related field in the latest API definition. It would help balance the large-job-starvation anomaly against blocking too many tasks through resource reservation; for example, the maximum reserve time could be scaled by job replicas or total requested resources.
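One possible reading of "scale by job replicas" is a linear policy with an upper bound. The sketch below assumes that shape; the function name scaledReserveTimeout and its parameters are hypothetical and do not come from the design doc:

```go
package main

import (
	"fmt"
	"time"
)

// scaledReserveTimeout is a hypothetical policy: a base reserve time plus a
// per-replica increment, capped at max so that a very large job cannot hold
// reserved resources indefinitely.
func scaledReserveTimeout(base, perReplica, max time.Duration, replicas int32) time.Duration {
	if replicas < 0 {
		replicas = 0
	}
	t := base + time.Duration(replicas)*perReplica
	if t > max {
		return max
	}
	return t
}

func main() {
	// 1 minute base, 10 seconds per replica, capped at 10 minutes.
	fmt.Println(scaledReserveTimeout(time.Minute, 10*time.Second, 10*time.Minute, 8))
	fmt.Println(scaledReserveTimeout(time.Minute, 10*time.Second, 10*time.Minute, 100))
}
```

The cap addresses the "block too many tasks" side of the trade-off, while the per-replica term gives larger gangs more time to assemble.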
Hey, I guess what you want is an SLA guarantee. If so, you can take a look at the SLA plugin.
@Thor-wl hi, thanks for replying. It seems that the SLA plugin implements the semantics I described above. What makes me curious is whether it is coupled with the Batch Volcano Job API. What if a user submits a job from another API group along with a podgroup (representing a gang entity)? How can Volcano guarantee its SLA?
Yes, this ability is currently bound to Volcano Job. @jiangkaihua Is there any plan to ensure SLA for other workloads?
Yes, I have proposed PR #1961 to solve it. When a user submits a job through another workload type such as replicaset, daemonset, etc., k8s creates the pods first and then invokes volcano to create a podgroup for the pods. A podgroup created from k8s pods therefore misses the annotations of the original workload, so configurations supplied as annotations are ignored, as in #1901.
So my solution is to fetch annotations from the upper resources by following the pod's ownerReferences and fill them into the podgroup annotations, so that configurations in annotations are available for jobs from other workload types.
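The idea can be sketched roughly as follows. This is not the code from PR #1961: the Object, OwnerReference and Store types are simplified local stand-ins for Kubernetes objects and an object lookup, the function inheritedAnnotations is hypothetical, and the annotation keys are made up for illustration. The sketch also assumes that annotations on objects closer to the pod take precedence over those higher up the owner chain.

```go
package main

import "fmt"

// OwnerReference and Object are simplified stand-ins for Kubernetes types.
type OwnerReference struct {
	Kind, Name string
	Controller bool
}

type Object struct {
	Kind, Name  string
	Annotations map[string]string
	Owners      []OwnerReference
}

// Store is a hypothetical lookup of objects keyed by "Kind/Name".
type Store map[string]*Object

func key(kind, name string) string { return kind + "/" + name }

// inheritedAnnotations walks up the chain of controller owner references,
// starting at obj, and merges annotations. Keys set on a nearer object win
// over the same keys on a more distant owner.
func inheritedAnnotations(store Store, obj *Object) map[string]string {
	merged := map[string]string{}
	seen := map[string]bool{} // guards against cyclic references
	for cur := obj; cur != nil; {
		k := key(cur.Kind, cur.Name)
		if seen[k] {
			break
		}
		seen[k] = true
		for ak, av := range cur.Annotations {
			if _, ok := merged[ak]; !ok {
				merged[ak] = av
			}
		}
		var next *Object
		for _, ref := range cur.Owners {
			if ref.Controller {
				next = store[key(ref.Kind, ref.Name)] // nil if the owner is unknown
				break
			}
		}
		cur = next
	}
	return merged
}

func main() {
	deploy := &Object{Kind: "Deployment", Name: "web",
		Annotations: map[string]string{"example.com/wait-timeout": "300"}}
	rs := &Object{Kind: "ReplicaSet", Name: "web-abc",
		Owners: []OwnerReference{{Kind: "Deployment", Name: "web", Controller: true}}}
	pod := &Object{Kind: "Pod", Name: "web-abc-1",
		Annotations: map[string]string{"example.com/team": "infra"},
		Owners:      []OwnerReference{{Kind: "ReplicaSet", Name: "web-abc", Controller: true}}}
	store := Store{key(deploy.Kind, deploy.Name): deploy, key(rs.Kind, rs.Name): rs}

	// The merged map is what would be copied into the podgroup annotations.
	fmt.Println(inheritedAnnotations(store, pod))
}
```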
Hello 👋 Looks like there was no activity on this issue for last 90 days. Do you mind updating us on the status? Is this still reproducible or needed? If yes, just comment on this PR or push a commit. Thanks! 🤗 If there will be no activity for 60 days, this issue will be closed (we can always reopen an issue if we need!).
Closing for now as there was no activity for last 60 days after marked as stale, let us know if you need this to be reopened! 🤗