[Core][GPU fraction][1/n] Unify node feasibility and availability checking
Description
Problem Statement
Currently, the scheduler’s node feasibility and availability checks are inconsistent with the actual resource allocation logic. The scheduler reasons only about aggregated GPU capacity per node, while the local allocator enforces constraints based on the per-GPU topology.
For example, consider a node with two GPUs, each with 0.2 GPU remaining. The scheduler observes 0.4 GPU available in total and concludes that an actor requesting 0.4 GPU can be placed on this node. However, the local allocator rejects the request because no single GPU has 0.4 GPU available.
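A minimal sketch (not Ray code) of the mismatch: the scheduler's aggregated view sums availability across GPUs, while the local allocator needs a single GPU that can satisfy the whole fractional request.

```python
def scheduler_check(per_gpu_available, request):
    """Aggregated view: sums remaining capacity across all GPUs on the node."""
    return sum(per_gpu_available) >= request

def allocator_check(per_gpu_available, request):
    """Per-GPU view: requires one GPU with enough remaining capacity."""
    return any(avail >= request for avail in per_gpu_available)

node = [0.2, 0.2]  # two GPUs, each with 0.2 GPU remaining
print(scheduler_check(node, 0.4))  # True: 0.4 GPU total looks feasible
print(allocator_check(node, 0.4))  # False: no single GPU has 0.4 remaining
```

The two checks disagree exactly in the fragmented case described above, which is why the scheduler places the actor on a node the allocator then rejects.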
What this PR does
The high-level goal of this PR is to make node feasibility and availability checks consistent between the scheduler and the resource allocator.
Although the detailed design is still a work in progress and will require a substantial refactor, the first step is to make the scheduler’s node feasibility and availability checks themselves consistent and centralized.
Right now, Ray has three scheduling paths:
- Normal task scheduling
- Normal actor scheduling
- Placement group
  - Placement group reservation (scheduling bundles)
  - Task/actor with a placement group
Tasks and actors essentially share the same scheduling path and use the same node feasibility and availability check function. Placement group scheduling, however, implements its own logic in certain paths, even though it is conceptually the same check.
Since later PRs may override or extend the node feasibility and availability checks, it is better to first ensure that all scheduling paths use a single, shared implementation of this logic.
This PR addresses that problem.
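A hypothetical sketch of the centralization this PR aims for: every scheduling policy delegates to one shared pair of feasibility/availability functions instead of re-implementing the comparison inline. The names here (`NodeResources`, `is_feasible`, `is_available`) are illustrative, not Ray's actual API.

```python
from dataclasses import dataclass

@dataclass
class NodeResources:
    total: dict      # resource name -> total capacity on the node
    available: dict  # resource name -> currently available amount

def is_feasible(node: NodeResources, request: dict) -> bool:
    """Could this node ever satisfy the request, ignoring current usage?"""
    return all(node.total.get(r, 0.0) >= amt for r, amt in request.items())

def is_available(node: NodeResources, request: dict) -> bool:
    """Can this node satisfy the request right now?"""
    return all(node.available.get(r, 0.0) >= amt for r, amt in request.items())

# Every policy (hybrid, spread, random, bundle scoring, ...) would call
# these two functions rather than duplicating the comparison logic.
node = NodeResources(total={"GPU": 2.0}, available={"GPU": 0.4})
print(is_feasible(node, {"GPU": 1.0}))   # True: the node has 2 GPUs in total
print(is_available(node, {"GPU": 1.0}))  # False: only 0.4 GPU is free now
```

With a single implementation, later PRs can change the underlying data structure (e.g. tracking per-GPU topology instead of aggregates) in one place and have every scheduling path pick it up.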
Related issues
Related to #52133 #54729
Additional information
Here I list all the call sites to verify that we rely on the same node feasibility and availability checking function. Later PRs can then focus on changing this function and its underlying data structures:
Normal task/actor scheduling:
- HybridSchedulingPolicy:
  - https://github.com/ray-project/ray/blob/555fab350c1c3179195889c437fe6213b416114c/src/ray/raylet/scheduling/policy/hybrid_scheduling_policy.cc#L41
  - https://github.com/ray-project/ray/blob/555fab350c1c3179195889c437fe6213b416114c/src/ray/raylet/scheduling/policy/hybrid_scheduling_policy.cc#L137
- SpreadSchedulingPolicy:
  - https://github.com/ray-project/ray/blob/555fab350c1c3179195889c437fe6213b416114c/src/ray/raylet/scheduling/policy/spread_scheduling_policy.cc#L49
  - https://github.com/ray-project/ray/blob/555fab350c1c3179195889c437fe6213b416114c/src/ray/raylet/scheduling/policy/spread_scheduling_policy.cc#L54
- RandomSchedulingPolicy:
  - https://github.com/ray-project/ray/blob/456d1903277668c1f79f3eb230b908a6e6c403a8/src/ray/raylet/scheduling/policy/random_scheduling_policy.cc#L47-L48
- NodeAffinitySchedulingPolicy:
  - No check needed by default; it just schedules to the user-specified node
  - Has a fallback option that does check: https://github.com/ray-project/ray/blob/1180868dd4472b444aaffb83a72779adc0dbe1e8/src/ray/raylet/scheduling/policy/node_affinity_scheduling_policy.cc#L26-L30
- NodeLabelSchedulingPolicy:
  - https://github.com/ray-project/ray/blob/1180868dd4472b444aaffb83a72779adc0dbe1e8/src/ray/raylet/scheduling/policy/node_label_scheduling_policy.cc#L171
  - https://github.com/ray-project/ray/blob/1180868dd4472b444aaffb83a72779adc0dbe1e8/src/ray/raylet/scheduling/policy/node_label_scheduling_policy.cc#L186
Placement Group reservation (scheduling bundles):
- PACK/SPREAD/STRICT_SPREAD
  - https://github.com/ray-project/ray/blob/1180868dd4472b444aaffb83a72779adc0dbe1e8/src/ray/raylet/scheduling/policy/scorer.cc#L58
    - Note: after this PR, it will also use IsAvailable
- STRICT_SPREAD
  - https://github.com/ray-project/ray/blob/1180868dd4472b444aaffb83a72779adc0dbe1e8/src/ray/raylet/scheduling/policy/scorer.cc#L58
    - Note: after this PR, it will also use IsAvailable
  - https://github.com/ray-project/ray/blob/1180868dd4472b444aaffb83a72779adc0dbe1e8/src/ray/raylet/scheduling/policy/bundle_scheduling_policy.cc#L185
Task/Actor with Placement Group:
- AffinityWithBundleSchedulingPolicy:
  - https://github.com/ray-project/ray/blob/1180868dd4472b444aaffb83a72779adc0dbe1e8/src/ray/raylet/scheduling/policy/affinity_with_bundle_scheduling_policy.cc#L25-L26
@ZacAttack @Sparks0219 PTAL too