feat: Add crossquota scheduler plugin for multi-resource quota management on GPU nodes

Open kingeasternsun opened this issue 5 months ago • 4 comments

What type of PR is this?

  • [x] Feature
  • [ ] Bug fix
  • [ ] Documentation update
  • [ ] Performance improvement
  • [ ] Refactoring
  • [ ] Test addition
  • [ ] Other

What this PR does / why we need it:

This PR introduces a new Volcano scheduler plugin called crossquota that provides comprehensive resource quota management for non-GPU tasks running on GPU nodes. The plugin extends beyond simple CPU quota management to support multiple resource types including CPU, memory, ephemeral storage, hugepages, and custom resources.

Key Features:

  • Multi-Resource Support: Unlike the previous CPU-only quota plugin, this plugin supports quotas for CPU, memory, ephemeral storage, hugepages, and any custom scalar resources
  • Flexible Configuration: Supports both absolute resource quotas (e.g., "4" cores, "8Gi" memory) and percentage-based quotas (e.g., 50% of node allocatable)
  • Node-Specific Overrides: Allows per-node quota configuration through Kubernetes annotations with higher priority than global plugin configuration
  • GPU Node Detection: Automatically identifies GPU nodes based on configurable resource patterns with regex support
  • Real-time Resource Tracking: Monitors and enforces resource usage on GPU nodes in real-time during scheduling
  • Backward Compatibility: Defaults to CPU-only quotas for existing deployments
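
As a rough illustration of the GPU node detection feature, the sketch below compiles a comma-separated list of resource-name patterns once and then checks a node's allocatable resources against them. It is not the plugin's actual code: the helper names `compilePatterns` and `isGPUNode`, the use of a plain `map[string]int64` for allocatable resources, and the decision to anchor each pattern to the full resource name are all assumptions made for this example.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// compilePatterns compiles a comma-separated pattern list once, so that no
// regex compilation happens on the scheduling path. Anchoring each pattern
// to the whole resource name is an assumption of this sketch.
func compilePatterns(csv string) ([]*regexp.Regexp, error) {
	var out []*regexp.Regexp
	for _, p := range strings.Split(csv, ",") {
		p = strings.TrimSpace(p)
		if p == "" {
			continue
		}
		re, err := regexp.Compile("^(?:" + p + ")$")
		if err != nil {
			return nil, fmt.Errorf("invalid GPU resource pattern %q: %w", p, err)
		}
		out = append(out, re)
	}
	return out, nil
}

// isGPUNode reports whether any allocatable resource with a positive
// quantity matches one of the compiled patterns.
func isGPUNode(allocatable map[string]int64, patterns []*regexp.Regexp) bool {
	for name, qty := range allocatable {
		if qty <= 0 {
			continue
		}
		for _, re := range patterns {
			if re.MatchString(name) {
				return true
			}
		}
	}
	return false
}

func main() {
	patterns, err := compilePatterns("nvidia.com/gpu,amd.com/gpu")
	if err != nil {
		panic(err)
	}
	gpuNode := map[string]int64{"cpu": 64, "nvidia.com/gpu": 8}
	cpuNode := map[string]int64{"cpu": 32}
	fmt.Println("gpu node:", isGPUNode(gpuNode, patterns))
	fmt.Println("cpu node:", isGPUNode(cpuNode, patterns))
}
```

A node that advertises a GPU resource with zero allocatable quantity is treated as a non-GPU node here; whether the plugin makes the same choice is not stated in this description.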

Why we need it:

  • Resource Fairness: Ensures fair resource allocation between GPU and non-GPU workloads on shared GPU nodes
  • Cost Optimization: Prevents non-GPU workloads from consuming excessive resources on expensive GPU infrastructure
  • Performance Isolation: Maintains predictable performance for GPU workloads by limiting resource contention
  • Operational Flexibility: Provides both global and node-specific quota management for different deployment scenarios

Configuration Priority (highest to lowest):

  1. Node annotation volcano.sh/crossquota-<resource-name> (absolute quota)
  2. Node annotation volcano.sh/crossquota-percentage-<resource-name> (percentage quota)
  3. Plugin argument quota.<resource-name> (absolute quota)
  4. Plugin argument quota-percentage.<resource-name> (percentage quota)
  5. Full allocatable resource (no quota)
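
The priority order above can be sketched as a single lookup function. This is a minimal illustration under stated assumptions, not the plugin's implementation: `resolveQuota` is a hypothetical helper, quantities are plain integers in base units (real values such as "8Gi" would need Kubernetes quantity parsing, for example `resource.ParseQuantity`), and returning an error on an invalid value is a design choice of the sketch.

```go
package main

import (
	"fmt"
	"strconv"
)

// Key prefixes taken from the priority list in this description.
const (
	annAbsPrefix = "volcano.sh/crossquota-"
	annPctPrefix = "volcano.sh/crossquota-percentage-"
	argAbsPrefix = "quota."
	argPctPrefix = "quota-percentage."
)

// resolveQuota walks the four configuration sources from highest to lowest
// priority and falls back to the full allocatable amount when none is set.
// Values are parsed as plain integers, which is a simplification.
func resolveQuota(resource string, allocatable int64, annotations, args map[string]string) (int64, error) {
	type source struct {
		values     map[string]string
		key        string
		percentage bool
	}
	sources := []source{
		{annotations, annAbsPrefix + resource, false},
		{annotations, annPctPrefix + resource, true},
		{args, argAbsPrefix + resource, false},
		{args, argPctPrefix + resource, true},
	}
	for _, s := range sources {
		raw, ok := s.values[s.key]
		if !ok {
			continue
		}
		v, err := strconv.ParseInt(raw, 10, 64)
		if err != nil || v < 0 {
			return 0, fmt.Errorf("invalid quota %q for key %q", raw, s.key)
		}
		if s.percentage {
			return allocatable * v / 100, nil
		}
		return v, nil
	}
	return allocatable, nil // no quota configured
}

func main() {
	annotations := map[string]string{"volcano.sh/crossquota-cpu": "6"}
	args := map[string]string{
		"quota.cpu":                          "4",
		"quota-percentage.ephemeral-storage": "50",
	}
	for _, r := range []string{"cpu", "ephemeral-storage", "memory"} {
		q, err := resolveQuota(r, 200, annotations, args)
		fmt.Println(r, q, err)
	}
}
```

In this example the node annotation for `cpu` wins over the plugin argument, `ephemeral-storage` falls through to the percentage argument, and `memory` has no quota and keeps the full allocatable amount.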

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

Implementation Details:

  • The plugin implements the framework.Plugin interface with proper OnSessionOpen and OnSessionClose lifecycle management
  • Resource tracking is done through cloned NodeInfo objects with modified allocatable resources based on quota calculations
  • Predicate functions ensure quota compliance during task scheduling
  • Event handlers maintain real-time resource usage tracking for allocated/deallocated tasks
  • Comprehensive error handling for invalid configurations and resource parsing
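
To make the interplay between the predicate and the event handlers concrete, here is a small standalone sketch. It deliberately avoids Volcano's real types: `Resources`, `quotaTracker`, `predicate`, `onAllocate`, and `onDeallocate` are hypothetical names for this example, and it assumes the caller has already filtered out GPU tasks so that only non-GPU tasks are checked against the quota.

```go
package main

import "fmt"

// Resources maps a resource name to an amount in base units (hypothetical type).
type Resources map[string]int64

// nodeQuota holds the quota for non-GPU tasks on one GPU node and the
// amount those tasks currently use.
type nodeQuota struct {
	quota Resources
	used  Resources
}

// quotaTracker keeps per-node state for GPU nodes only.
type quotaTracker struct {
	nodes map[string]*nodeQuota
}

func newQuotaTracker(quotas map[string]Resources) *quotaTracker {
	qt := &quotaTracker{nodes: make(map[string]*nodeQuota)}
	for node, q := range quotas {
		qt.nodes[node] = &nodeQuota{quota: q, used: Resources{}}
	}
	return qt
}

// predicate rejects a non-GPU task if it would push any quota-managed
// resource over its limit. Nodes without an entry are not restricted.
func (qt *quotaTracker) predicate(node string, request Resources) error {
	nq, ok := qt.nodes[node]
	if !ok {
		return nil
	}
	for name, limit := range nq.quota {
		if nq.used[name]+request[name] > limit {
			return fmt.Errorf("node %s: %s quota exceeded (used %d + requested %d > quota %d)",
				node, name, nq.used[name], request[name], limit)
		}
	}
	return nil
}

// onAllocate records usage when a task is allocated to a tracked node.
func (qt *quotaTracker) onAllocate(node string, request Resources) {
	if nq, ok := qt.nodes[node]; ok {
		for name, amount := range request {
			nq.used[name] += amount
		}
	}
}

// onDeallocate releases usage when a task leaves a tracked node.
func (qt *quotaTracker) onDeallocate(node string, request Resources) {
	if nq, ok := qt.nodes[node]; ok {
		for name, amount := range request {
			nq.used[name] -= amount
		}
	}
}

func main() {
	qt := newQuotaTracker(map[string]Resources{"gpu-node-1": {"cpu": 4000}})
	task := Resources{"cpu": 3000}
	fmt.Println(qt.predicate("gpu-node-1", task))
	qt.onAllocate("gpu-node-1", task)
	fmt.Println(qt.predicate("gpu-node-1", Resources{"cpu": 2000}))
}
```

The map lookups keep the per-task check cheap, which matches the direct-lookup approach described under Performance Considerations; the actual plugin tracks usage through cloned NodeInfo objects rather than a separate map like this one.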

Testing:

  • Unit test coverage: 79.7% with comprehensive test cases for all major functionality
  • Tests cover plugin initialization, argument parsing, GPU node detection, CPU pod identification, quota calculation, predicate logic, and event handling
  • Mock implementations for api.Devices interface to support GPU sharing scenarios
  • Helper functions for consistent test data construction

Performance Considerations:

  • Minimal overhead during scheduling as quota calculations are cached per node
  • Efficient resource tracking through direct map lookups
  • Regex compilation is done once during plugin initialization

Does this PR introduce a user-facing change?

feat: Add crossquota scheduler plugin for multi-resource quota management on GPU nodes

- Support for CPU, memory, ephemeral storage, hugepages, and custom resource quotas
- Flexible configuration with both absolute and percentage-based quotas
- Node-specific quota overrides through Kubernetes annotations
- Automatic GPU node detection with regex pattern support
- Real-time resource tracking and enforcement during scheduling
- Backward compatibility with existing CPU quota configurations
  
Configuration example:

```yaml
actions: "enqueue, allocate, backfill"
tiers:
- plugins:
  - name: crossquota
    arguments:
      gpu-resource-names: "nvidia.com/gpu,amd.com/gpu"
      quota-resources: "cpu,memory,ephemeral-storage"
      quota.cpu: "4"
      quota.memory: "8Gi"
      quota-percentage.ephemeral-storage: "50"
```

Node annotation example:

```yaml
metadata:
  annotations:
    volcano.sh/crossquota-cpu: "6"
    volcano.sh/crossquota-memory: "12Gi"
    volcano.sh/crossquota-percentage-ephemeral-storage: "75"
```

kingeasternsun avatar Aug 18 '25 08:08 kingeasternsun

Is this meant to keep CPU workloads from occupying GPU node resources and leaving GPU tasks unschedulable? If so, https://github.com/volcano-sh/volcano/pull/4391 and https://github.com/volcano-sh/volcano/pull/4454 are also options.

Monokaix avatar Aug 19 '25 01:08 Monokaix

> Is this meant to keep CPU workloads from occupying GPU node resources and leaving GPU tasks unschedulable? If so, #4391 and #4454 are also options.

The ResourceFit strategy falls short of customer requirements, as it only provides a soft preference to steer CPU workloads away from GPU nodes. In contrast, crossquota introduces a hard isolation mechanism: CPU workloads are permitted to run on GPU nodes, but their CPU usage is strictly capped by a predefined quota.

kingeasternsun avatar Aug 19 '25 01:08 kingeasternsun

Wait for this https://github.com/volcano-sh/volcano/pull/4454

kingeasternsun avatar Sep 09 '25 03:09 kingeasternsun

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:

Once this PR has been reviewed and has the lgtm label, please assign jessestutler for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment.
Approvers can cancel approval by writing /approve cancel in a comment.

volcano-sh-bot avatar Nov 04 '25 01:11 volcano-sh-bot