[Feature][ResourceOversubscriptionManager] Improving resource oversubscription handling in Apache DolphinScheduler

Open kito4 opened this issue 1 month ago • 4 comments

Search before asking

  • [x] I had searched in the issues and found no similar feature requirement.

Description

As part of an educational research project at ITMO University, we aim to investigate how open-source schedulers, specifically Apache DolphinScheduler (DS), handle resource oversubscription. Oversubscription — allocating more tasks than available physical or logical resources — can increase utilization and reduce costs but often leads to performance degradation, instability, or SLA violations for critical workloads. The project will focus on identifying technical gaps in DS and proposing mechanisms to manage oversubscription safely, including metrics, scheduling policies, prioritization, and throttling strategies.

Use case

A DS cluster runs multiple concurrent workflows, temporarily exceeding available CPU, memory, or I/O resources. Without proper control, worker nodes may become overloaded, task queues grow, and critical tasks may fail or be delayed. The research project will explore potential solutions such as:

  • Prioritizing critical workflows under oversubscription.
  • Implementing back-pressure or throttling mechanisms.
  • Adding observability and metrics for oversubscription states.
  • Testing and simulating scenarios to evaluate improvements in throughput, latency, and stability.

Related issues

No response

Are you willing to submit a PR?

  • [x] Yes I am willing to submit a PR!

Code of Conduct

kito4 avatar Nov 09 '25 22:11 kito4

Please provide the actual production problems you want to solve and the detailed design scheme. I don't understand what this issue wants to do.

SbloodyS avatar Nov 10 '25 01:11 SbloodyS

In production environments, multiple workflows often reserve more resources (CPU, RAM) than they actually use. For example, several tasks each declare 8 GB RAM but only consume 1–2 GB on average. As a result, cluster utilization stays low even though no new workflows can be scheduled — because declared resources exceed physical capacity. To improve efficiency, we can apply controlled resource oversubscription: temporarily allocating more logical resources than physically available, based on real usage metrics. However, DolphinScheduler currently lacks mechanisms to monitor real-time utilization or to manage safe oversubscription without risking node overload or instability.

Key Components:

  • ResourceMonitorAnalyzer: processes real-time CPU and memory data already reported by worker heartbeats and monitoring controllers.
  • OversubscriptionController: calculates the oversubscription ratio and decides whether to allow or delay task dispatch.
  • PolicyEngine: defines prioritization and throttling rules under oversubscription.
  • MetricsReporter: exports oversubscription metrics to the existing metrics framework (Prometheus, REST API).

Workflow

  1. Each worker periodically reports actual resource usage (usedCPU, usedMemory).
  2. The OversubscriptionController calculates:
       oversubscription_ratio = allocated_resources / physical_resources
       utilization_rate = used_resources / physical_resources

  • If utilization_rate < threshold (e.g., 60%), new tasks can be accepted even if allocated > 100%.
  • If utilization_rate > safety limit (e.g., 90%), the controller triggers back-pressure and suspends new task dispatch.
  • Tasks can be prioritized based on workflow class (CRITICAL > NORMAL > BEST_EFFORT).

Configuration Parameters

  • maxOversubscriptionFactor (maximum ratio of allocated to physical resources allowed): 1.5
  • lowUtilizationThreshold (CPU/memory usage below which oversubscription is safe): 60%
  • highUtilizationThreshold (utilization above which task submission is throttled): 90%
  • priorityMode (workflow scheduling priority mode): NORMAL
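
The dispatch decision described above can be sketched as follows. This is a minimal illustration under stated assumptions, not DolphinScheduler code: OversubscriptionSketch, Decision, and decide are hypothetical names, and the thresholds are the values proposed in the configuration parameters. The proposal does not say what happens between the two thresholds or when allocation is at or below 100%, so the sketch assumes "delay" and "accept" respectively.

```java
// Hypothetical sketch: none of these names exist in DolphinScheduler.
final class OversubscriptionSketch {

    enum Decision { ACCEPT, DELAY, BACK_PRESSURE }

    // Values taken from the proposed configuration parameters.
    static final double MAX_OVERSUBSCRIPTION_FACTOR = 1.5;
    static final double LOW_UTILIZATION_THRESHOLD = 0.60;
    static final double HIGH_UTILIZATION_THRESHOLD = 0.90;

    /**
     * @param allocated declared resources, including the candidate task (e.g. GB of RAM)
     * @param used      resources actually in use, as reported by the worker
     * @param physical  physical capacity of the worker, in the same unit
     */
    static Decision decide(double allocated, double used, double physical) {
        if (physical <= 0) {
            throw new IllegalArgumentException("physical capacity must be positive");
        }
        double oversubscriptionRatio = allocated / physical;
        double utilizationRate = used / physical;

        // Above the safety limit: suspend dispatch regardless of allocation.
        if (utilizationRate > HIGH_UTILIZATION_THRESHOLD) {
            return Decision.BACK_PRESSURE;
        }
        // Assumption: a node that is not oversubscribed accepts tasks as usual.
        if (oversubscriptionRatio <= 1.0) {
            return Decision.ACCEPT;
        }
        // Oversubscribed: accept only while real usage is low and the
        // allocation stays within the configured factor.
        if (utilizationRate < LOW_UTILIZATION_THRESHOLD
                && oversubscriptionRatio <= MAX_OVERSUBSCRIPTION_FACTOR) {
            return Decision.ACCEPT;
        }
        // Assumption: everything else waits in the queue.
        return Decision.DELAY;
    }
}
```

For example, a 32 GB node with 40 GB declared but only 8 GB in use yields ACCEPT, while the same node with 30 GB in use yields BACK_PRESSURE.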

kito4 avatar Nov 10 '25 12:11 kito4

Server load protection was implemented a long time ago, and Prometheus metrics were also implemented in version 3.x. Which version are you using?

SbloodyS avatar Nov 10 '25 13:11 SbloodyS

Can I write a load balancer with CPU and ThreadPool oversubscription? In DynamicWeightedRoundRobinWorkerLoadBalancer each worker has a weight, and it is never more than 100. What if we made it possible for the weight to exceed that value? Individual tasks may take longer, but the overall throughput for a group of tasks could improve, especially when some resources are underutilized. Memory oversubscription may be risky, so this feature is intended for CPU and ThreadPool only.

private double calculateWeight(WorkerServerMetadata server) {
    // Weighted sum of the three usage metrics, averaged over the three terms.
    double load =
            dynamicWeightConfigProperties.getCpuUsageWeight() * server.getCpuUsage()
                    + dynamicWeightConfigProperties.getMemoryUsageWeight() * server.getMemoryUsage()
                    + dynamicWeightConfigProperties.getTaskThreadPoolUsageWeight()
                            * server.getTaskThreadPoolUsage();
    load = load / 3.0;

    // Proposed change: scale the maximum weight (currently 100) by an oversubscription factor.
    double osFactor = dynamicWeightConfigProperties.getOversubscriptionFactor();
    double maxWeight = 100 * osFactor;

    // A lower load gives a higher weight; never return a negative weight.
    double weight = maxWeight - load;
    return Math.max(weight, 0.0);
}
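
To see what the factor does to the relative weights, the arithmetic can be reproduced in isolation. This is a minimal sketch, assuming usage values are percentages from 0 to 100 and all three usage weights are 1.0 (both are assumptions, not confirmed by the snippet above). WeightSketch and its weight method are hypothetical stand-ins that take plain numbers instead of WorkerServerMetadata.

```java
// Hypothetical stand-alone version of the weight formula, for illustration only.
final class WeightSketch {

    static double weight(double cpuUsage, double memoryUsage, double threadPoolUsage,
                         double cpuWeight, double memoryWeight, double threadPoolWeight,
                         double oversubscriptionFactor) {
        // Same formula as the proposal: weighted sum of usages divided by 3.
        double load = (cpuWeight * cpuUsage
                + memoryWeight * memoryUsage
                + threadPoolWeight * threadPoolUsage) / 3.0;
        double maxWeight = 100 * oversubscriptionFactor;
        return Math.max(maxWeight - load, 0.0);
    }

    public static void main(String[] args) {
        // Busy worker: every metric at 90%. Idle worker: every metric at 30%.
        for (double factor : new double[] {1.0, 1.5}) {
            double busy = weight(90, 90, 90, 1.0, 1.0, 1.0, factor);
            double idle = weight(30, 30, 30, 1.0, 1.0, 1.0, factor);
            System.out.println("factor=" + factor + " busy=" + busy + " idle=" + idle);
        }
    }
}
```

Under these assumptions the busy worker gets weight 10 and the idle worker 70 with a factor of 1.0 (a 1:7 split). With a factor of 1.5 the weights become 60 and 120 (1:2), so the factor shifts a larger share of tasks toward the busier worker.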

kito4 avatar Dec 10 '25 22:12 kito4

@kito4 Do you mean setting a parameter to prevent workers from reducing their potential load? In my view, this may not necessarily be genuinely useful and could lead to more complex configurations. It would be difficult for us to set an effective parameter.

ruanwenjun avatar Dec 14 '25 03:12 ruanwenjun

@kito4 Do you mean setting a parameter to prevent workers from reducing their potential load? In my view, this may not necessarily be genuinely useful and could lead to more complex configurations. It would be difficult for us to set an effective parameter.

+1

SbloodyS avatar Dec 18 '25 06:12 SbloodyS