
[Core][GPU fraction][1/n] Unify node feasibility and availability checking

Open · Yicheng-Lu-llll opened this issue 3 weeks ago · 1 comment

Description

Problem Statement

Currently, the scheduler’s node feasibility and availability checks are inconsistent with the actual resource allocation logic. The scheduler reasons only about aggregated GPU capacity per node, while the local allocator enforces constraints based on the per-GPU topology.

For example, consider a node with two GPUs, each with 0.2 GPU remaining. The scheduler observes 0.4 GPU available in total and concludes that an actor requesting 0.4 GPU can be placed on this node. However, the local allocator rejects the request because no single GPU has 0.4 GPU available.
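The mismatch can be sketched in a few lines. This is an illustrative model, not Ray's actual code: `aggregate_check` stands in for the scheduler's aggregated view and `per_gpu_check` for the local allocator's per-GPU view; both function names are made up for this example.

```python
# Illustrative sketch (not Ray's actual code): why an aggregate GPU check
# and a per-GPU check can disagree on the same node.

def aggregate_check(per_gpu_available, request):
    # Scheduler-style check: sums fractional capacity across all GPUs.
    return sum(per_gpu_available) >= request

def per_gpu_check(per_gpu_available, request):
    # Local-allocator-style check: a fractional request must fit on one GPU.
    return any(avail >= request for avail in per_gpu_available)

node = [0.2, 0.2]  # two GPUs, each with 0.2 GPU remaining
print(aggregate_check(node, 0.4))  # True  -> scheduler places the actor here
print(per_gpu_check(node, 0.4))    # False -> local allocator rejects it
```

Because the two checks answer different questions, the scheduler can repeatedly pick a node that the allocator will never accept.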

What this PR does

The high-level goal of this PR is to make node feasibility and availability checks consistent between the scheduler and the resource allocator.

Although the detailed design is still a work in progress and will need a larger refactor, the first step is to make the scheduler’s own node feasibility and availability checks consistent and centralized.

Right now, Ray has three scheduling paths:

  • Normal task scheduling
  • Normal actor scheduling
  • Placement group
    • Placement Group reservation (bundle scheduling)
    • Task/Actor with Placement Group

Tasks and actors essentially share the same scheduling path and use the same node feasibility and availability check function. Placement group scheduling, however, implements its own logic in certain paths, even though it is conceptually the same check.

Since we may override or extend the node feasibility and availability checks in later PRs, it is better to first ensure that all scheduling paths use a single, shared implementation of this logic.

This PR addresses that problem.
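The unification goal can be sketched as a single shared pair of checks that every policy routes through. All names below (`NodeResources`, `is_feasible`, `is_available`) are hypothetical and chosen for illustration; they are not Ray's actual API.

```python
# Hypothetical sketch of the unification goal: every scheduling path calls
# one shared pair of checks instead of re-implementing its own logic.
from dataclasses import dataclass

@dataclass
class NodeResources:
    total: dict       # e.g. {"GPU": 2.0}
    available: dict   # e.g. {"GPU": 0.4}

def is_feasible(node: NodeResources, request: dict) -> bool:
    # Could the node EVER satisfy the request, ignoring current usage?
    return all(node.total.get(r, 0.0) >= amt for r, amt in request.items())

def is_available(node: NodeResources, request: dict) -> bool:
    # Can the node satisfy the request RIGHT NOW?
    return all(node.available.get(r, 0.0) >= amt for r, amt in request.items())

# If every policy (hybrid, spread, random, bundle scorer, ...) calls these
# two functions, overriding them later (e.g. to consult per-GPU topology)
# changes behavior everywhere at once.
node = NodeResources(total={"GPU": 2.0}, available={"GPU": 0.4})
print(is_feasible(node, {"GPU": 1.0}))   # True: fits the node's total capacity
print(is_available(node, {"GPU": 1.0}))  # False: only 0.4 GPU free right now
```

The point of centralizing first is that a later change (such as tracking per-GPU availability) then only needs to touch these two functions and their backing data structure.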

Related issues

Related to #52133 #54729

Additional information

Here I list all the call sites, to make sure they rely on the same node feasibility and availability checking function. Later we can focus on changing just that function and its underlying data structure:

Normal task/actor scheduling:

  • HybridSchedulingPolicy:

    • https://github.com/ray-project/ray/blob/555fab350c1c3179195889c437fe6213b416114c/src/ray/raylet/scheduling/policy/hybrid_scheduling_policy.cc#L41
    • https://github.com/ray-project/ray/blob/555fab350c1c3179195889c437fe6213b416114c/src/ray/raylet/scheduling/policy/hybrid_scheduling_policy.cc#L137
  • SpreadSchedulingPolicy:

    • https://github.com/ray-project/ray/blob/555fab350c1c3179195889c437fe6213b416114c/src/ray/raylet/scheduling/policy/spread_scheduling_policy.cc#L49
    • https://github.com/ray-project/ray/blob/555fab350c1c3179195889c437fe6213b416114c/src/ray/raylet/scheduling/policy/spread_scheduling_policy.cc#L54
  • RandomSchedulingPolicy

    • https://github.com/ray-project/ray/blob/456d1903277668c1f79f3eb230b908a6e6c403a8/src/ray/raylet/scheduling/policy/random_scheduling_policy.cc#L47-L48
  • NodeAffinitySchedulingPolicy

    • Not a concern: by default it simply schedules onto the user-specified node
    • Its fallback option, however, does check: https://github.com/ray-project/ray/blob/1180868dd4472b444aaffb83a72779adc0dbe1e8/src/ray/raylet/scheduling/policy/node_affinity_scheduling_policy.cc#L26-L30
  • NodeLabelSchedulingPolicy

    • https://github.com/ray-project/ray/blob/1180868dd4472b444aaffb83a72779adc0dbe1e8/src/ray/raylet/scheduling/policy/node_label_scheduling_policy.cc#L171
    • https://github.com/ray-project/ray/blob/1180868dd4472b444aaffb83a72779adc0dbe1e8/src/ray/raylet/scheduling/policy/node_label_scheduling_policy.cc#L186

Placement Group reservation (bundle scheduling):

  • PACK/SPREAD/STRICT_SPREAD
    • https://github.com/ray-project/ray/blob/1180868dd4472b444aaffb83a72779adc0dbe1e8/src/ray/raylet/scheduling/policy/scorer.cc#L58
      • Note: after this PR, this will also use IsAvailable
  • STRICT_SPREAD
    • https://github.com/ray-project/ray/blob/1180868dd4472b444aaffb83a72779adc0dbe1e8/src/ray/raylet/scheduling/policy/scorer.cc#L58
      • Note: after this PR, this will also use IsAvailable
    • https://github.com/ray-project/ray/blob/1180868dd4472b444aaffb83a72779adc0dbe1e8/src/ray/raylet/scheduling/policy/bundle_scheduling_policy.cc#L185

Task/Actor with Placement Group:

  • AffinityWithBundleSchedulingPolicy
    • https://github.com/ray-project/ray/blob/1180868dd4472b444aaffb83a72779adc0dbe1e8/src/ray/raylet/scheduling/policy/affinity_with_bundle_scheduling_policy.cc#L25-L26

Yicheng-Lu-llll · Dec 09 '25 00:12

@ZacAttack @Sparks0219 PTAL too

edoakes · Dec 09 '25 16:12