feat: Add crossquota scheduler plugin for multi-resource quota management on GPU nodes
What type of PR is this?
- [x] Feature
- [ ] Bug fix
- [ ] Documentation update
- [ ] Performance improvement
- [ ] Refactoring
- [ ] Test addition
- [ ] Other
What this PR does / why we need it:
This PR introduces a new Volcano scheduler plugin called crossquota that provides comprehensive resource quota management for non-GPU tasks running on GPU nodes. The plugin extends beyond simple CPU quota management to support multiple resource types including CPU, memory, ephemeral storage, hugepages, and custom resources.
Key Features:
- Multi-Resource Support: Unlike the previous CPU-only quota plugin, this plugin supports quotas for CPU, memory, ephemeral storage, hugepages, and any custom scalar resources
- Flexible Configuration: Supports both absolute resource quotas (e.g., "4" cores, "8Gi" memory) and percentage-based quotas (e.g., 50% of node allocatable)
- Node-Specific Overrides: Allows per-node quota configuration through Kubernetes annotations with higher priority than global plugin configuration
- GPU Node Detection: Automatically identifies GPU nodes based on configurable resource patterns with regex support
- Real-time Resource Tracking: Monitors and enforces resource usage on GPU nodes in real-time during scheduling
- Backward Compatibility: Defaults to CPU-only quotas for existing deployments
Why we need it:
- Resource Fairness: Ensures fair resource allocation between GPU and non-GPU workloads on shared GPU nodes
- Cost Optimization: Prevents non-GPU workloads from consuming excessive resources on expensive GPU infrastructure
- Performance Isolation: Maintains predictable performance for GPU workloads by limiting resource contention
- Operational Flexibility: Provides both global and node-specific quota management for different deployment scenarios
Configuration Priority (highest to lowest):
- Node annotation
volcano.sh/crossquota-<resource-name>(absolute quota) - Node annotation
volcano.sh/crossquota-percentage-<resource-name>(percentage quota) - Plugin argument
quota.<resource-name>(absolute quota) - Plugin argument
quota-percentage.<resource-name>(percentage quota) - Full allocatable resource (no quota)
Which issue(s) this PR fixes:
Fixes #
Special notes for your reviewer:
Implementation Details:
- The plugin implements the
framework.Plugininterface with properOnSessionOpenandOnSessionCloselifecycle management - Resource tracking is done through cloned
NodeInfoobjects with modified allocatable resources based on quota calculations - Predicate functions ensure quota compliance during task scheduling
- Event handlers maintain real-time resource usage tracking for allocated/deallocated tasks
- Comprehensive error handling for invalid configurations and resource parsing
Testing:
- Unit test coverage: 79.7% with comprehensive test cases for all major functionality
- Tests cover plugin initialization, argument parsing, GPU node detection, CPU pod identification, quota calculation, predicate logic, and event handling
- Mock implementations for
api.Devicesinterface to support GPU sharing scenarios - Helper functions for consistent test data construction
Performance Considerations:
- Minimal overhead during scheduling as quota calculations are cached per node
- Efficient resource tracking through direct map lookups
- Regex compilation is done once during plugin initialization
Does this PR introduce a user-facing change?
feat: Add crossquota scheduler plugin for multi-resource quota management on GPU nodes
- Support for CPU, memory, ephemeral storage, hugepages, and custom resource quotas
- Flexible configuration with both absolute and percentage-based quotas
- Node-specific quota overrides through Kubernetes annotations
- Automatic GPU node detection with regex pattern support
- Real-time resource tracking and enforcement during scheduling
- Backward compatibility with existing CPU quota configurations
Configuration example:
```yaml
actions: "enqueue, allocate, backfill"
tiers:
- plugins:
- name: crossquota
arguments:
gpu-resource-names: "nvidia.com/gpu,amd.com/gpu"
quota-resources: "cpu,memory,ephemeral-storage"
quota.cpu: "4"
quota.memory: "8Gi"
quota-percentage.ephemeral-storage: "50"
Node annotation example:
metadata:
annotations:
volcano.sh/crossquota-cpu: "6"
volcano.sh/crossquota-memory: "12Gi"
volcano.sh/crossquota-percentage-ephemeral-storage: "75"
Is this to avoid the CPU load occupying the GPU node resources and causing the GPU task to be unable to be scheduled? If that's true, https://github.com/volcano-sh/volcano/pull/4391 and https://github.com/volcano-sh/volcano/pull/4454 is also a way.
Is this to avoid the CPU load occupying the GPU node resources and causing the GPU task to be unable to be scheduled? If that's true, #4391 and #4454 is also a way.
The ResourceFit strategy falls short of customer requirements, as it only provides a soft preference to steer CPU workloads away from GPU nodes. In contrast, crossquota introduces a hard isolation mechanism: CPU workloads are permitted to run on GPU nodes, but their CPU usage is strictly capped by a predefined quota.
Wait for this https://github.com/volcano-sh/volcano/pull/4454
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: Once this PR has been reviewed and has the lgtm label, please assign jessestutler for approval. For more information see the Kubernetes Code Review Process.
The full list of commands accepted by this bot can be found here.
Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment