[Discussion] fairshare dws value issues
What would you like to be cleaned: With this fix #6617...
```go
return max(1, int(dws)) // truncates dws to an int and clamps anything below 1 up to 1
```
...it's opened up a lot of new considerations we should discuss. (Totally understand that this is an initial fix to stop the fairsharing reclamation bug.) Let's use this issue to consider a longer-term solution:
For a large enough cohort, these pairs could end up equivalent in fairsharing tournaments (as sketched below):
- borrowing 4, weight 1 | borrowing 100, weight 99
- borrowing 4, weight 1 | borrowing 4, weight 99
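As a rough, self-contained sketch of that collapse (the formula below is a simplified stand-in for `DominantResourceShare`, and all quantities are made up):

```go
package main

import "fmt"

// bucketedDWS is a simplified stand-in for DominantResourceShare followed by
// the max(1, int(dws)) clamp from #6617; the real formula lives in Kueue.
func bucketedDWS(borrowed, lendable int64, weight float64) int {
	dws := float64(borrowed) * 1000 / float64(lendable) / weight
	return max(1, int(dws))
}

func main() {
	lendable := int64(100_000) // a large cohort

	fmt.Println(bucketedDWS(4, lendable, 1))    // -> 1
	fmt.Println(bucketedDWS(100, lendable, 99)) // -> 1
	fmt.Println(bucketedDWS(4, lendable, 99))   // -> 1
	// All three land in the same bucket, so ordering falls through
	// to the tiebreakers discussed below.
}
```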
How fairshare workload ordering happens today:
```go
// candidatesOrdering criteria:
// 0. Workloads already marked for preemption first.
// 1. Workloads from other ClusterQueues in the cohort before the ones in the
//    same ClusterQueue as the preemptor.
// 2. Workloads with lower priority first.
// 3. Workloads admitted more recently first. ⭐️ this one deserves discussion
func CandidatesOrdering(
```
Problem 1: For dws values that fall in the same int bucket, what is the tiebreaker?
- The primary tiebreaker that's relevant to discuss here is timestamp
- Is this enough considering the scenarios above? (maybe it is)
- If it's not enough, should CQ weight somehow be intermingled with timestamp for ordering? (a possible shape is sketched below)
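One possible shape for that tiebreak, purely as a sketch; the `candidateInfo` type and its fields are hypothetical, not the actual struct used by preemption:

```go
package main

import (
	"fmt"
	"time"
)

// candidateInfo is a hypothetical stand-in for the per-workload data the
// preemption ordering already has access to; not the real Kueue struct.
type candidateInfo struct {
	fairWeight    float64   // the candidate's CQ fair-sharing weight
	admissionTime time.Time // when the workload was admitted
}

// lessForPreemption breaks ties between candidates whose dws values share an
// int bucket: preempt from the lower-weighted CQ first, and only when the
// weights match fall back to "admitted more recently first".
func lessForPreemption(a, b candidateInfo) bool {
	if a.fairWeight != b.fairWeight {
		return a.fairWeight < b.fairWeight
	}
	return a.admissionTime.After(b.admissionTime)
}

func main() {
	now := time.Now()
	a := candidateInfo{fairWeight: 1, admissionTime: now.Add(-time.Hour)}
	b := candidateInfo{fairWeight: 99, admissionTime: now}
	fmt.Println(lessForPreemption(a, b)) // true: the weight-1 CQ is preempted first
}
```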
Problem 2: How "similar" is similar for dws value precision bucketing?
- float vs. int probably doesn't matter much, because we could shift weights
- we should probably use `10K` instead of `1K` for these to accommodate large weights/cohorts (see the sketch below):

```go
ratio := b * 1000 / lr
dws := drs * 1000 / node.fairWeight().MilliValue()
```
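As a rough illustration with made-up numbers (and assuming the same factor would be applied in both stages), a larger scale keeps values that a `1K` scale truncates to zero:

```go
package main

import "fmt"

func main() {
	borrowed, lendable := int64(7), int64(1000) // made-up borrowing ratio of 7/1000
	weightMilli := int64(50_000)                // a fair weight of 50, in milli units

	for _, scale := range []int64{1000, 10_000} {
		ratio := borrowed * scale / lendable
		dws := ratio * scale / weightMilli
		fmt.Printf("scale=%d ratio=%d dws=%d\n", scale, ratio, dws)
	}
	// scale=1000:  ratio=7,  dws=0  -> clamped to 1, ordering information lost
	// scale=10000: ratio=70, dws=14 -> still distinguishes this CQ from its peers
}
```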
Primarily using this GitHub issue to track the discussion. We'll probably need to break this out into other issues if we decide these problems are worth pursuing.
cc/ @pajakd @tenzen-y @PBundyra @gabesaba
Related https://github.com/kubernetes-sigs/kueue/issues/4247
For the sake of clarity and completeness, there are two algorithms to order candidates for FS:
- For preemption there is the one you mentioned which is used to order workloads within a single CQ
- For ordering workloads that come from different CQs, to determine which one should be handled first within a single scheduling cycle: https://github.com/kubernetes-sigs/kueue/blob/main/pkg/scheduler/fair_sharing_iterator.go#L167-L188. The criteria there are (sketched after this list):
1. DRF
2. Priority
3. Timestamp
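Roughly, that ordering could be pictured like the sketch below; `schedEntry` and its fields are illustrative, not the actual iterator code:

```go
package main

import (
	"fmt"
	"time"
)

// schedEntry is a hypothetical view of a workload waiting in a scheduling cycle;
// field names are illustrative, not the fair_sharing_iterator types.
type schedEntry struct {
	dominantResourceShare int       // DRF value of the workload's CQ
	priority              int32     // workload priority
	timestamp             time.Time // admission/creation time
}

// lessForScheduling handles the workload from the CQ with the lowest DRF first,
// then the higher-priority workload, then the older one.
func lessForScheduling(a, b schedEntry) bool {
	if a.dominantResourceShare != b.dominantResourceShare {
		return a.dominantResourceShare < b.dominantResourceShare
	}
	if a.priority != b.priority {
		return a.priority > b.priority
	}
	return a.timestamp.Before(b.timestamp)
}

func main() {
	now := time.Now()
	a := schedEntry{dominantResourceShare: 1, priority: 0, timestamp: now}
	b := schedEntry{dominantResourceShare: 5, priority: 100, timestamp: now}
	fmt.Println(lessForScheduling(a, b)) // true: lower DRF wins before priority is considered
}
```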
Regarding the problems you mentioned:
Ad. 1 I like your proposal of falling back to the CQ's weight a lot. Only when the weights are the same would I fall back to timestamps.
Ad. 2 Please see my proposal here: https://github.com/kubernetes-sigs/kueue/issues/6774#issuecomment-3291706679