[DSIP-98][Worker] Add per-task-type maximum concurrency limits on Workers

Open ruanwenjun opened this issue 1 month ago • 7 comments

Search before asking

  • [x] I had searched in the DSIP and found no similar DSIP.

Motivation

todo:

Design Detail

todo:

Compatibility, Deprecation, and Migration Plan

No response

Test Plan

No response

Code of Conduct

ruanwenjun avatar Nov 20 '25 01:11 ruanwenjun

Can you confirm if my understanding is correct? For example, suppose we have worker1 and worker2, and worker1 has more resources (such as CPU and memory) compared to worker2.

In such an unbalanced cluster environment, I can set different concurrency limits for worker1 and worker2. For instance, on worker1, the concurrency for shell tasks can be set to 10, and for SQL tasks, it can be set to 5. Meanwhile, since worker2 has fewer resources, the concurrency for shell tasks on worker2 can be set to 5, and for SQL tasks, it can be set to 2.

Mrhs121 avatar Nov 24 '25 07:11 Mrhs121

> Can you confirm if my understanding is correct? For example, suppose we have worker1 and worker2, and worker1 has more resources (such as CPU and memory) compared to worker2.
>
> In such an unbalanced cluster environment, I can set different concurrency limits for worker1 and worker2. For instance, on worker1, the concurrency for shell tasks can be set to 10, and for SQL tasks, it can be set to 5. Meanwhile, since worker2 has fewer resources, the concurrency for shell tasks on worker2 can be set to 5, and for SQL tasks, it can be set to 2.

Yes, this is a use case. However, our worker node configurations are typically identical. We aim to support varying levels of concurrency for different tasks—for instance, SQL/HTTP tasks can handle higher concurrency, while shell tasks support lower concurrency.
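
To make the idea concrete, here is a minimal sketch of a per-task-type limit on a single worker. It is only an illustration and not the design of this DSIP (the Design Detail section is still todo): the class name `TaskTypeConcurrencyLimiter`, its methods, and the task-type keys such as `"SHELL"` and `"SQL"` are all hypothetical, and it assumes that task types without a configured limit stay unrestricted.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Semaphore;

/** Hypothetical sketch: one counting semaphore per task type on a single worker. */
final class TaskTypeConcurrencyLimiter {

    private final Map<String, Semaphore> permitsByTaskType = new ConcurrentHashMap<>();

    /**
     * @param maxConcurrencyByTaskType e.g. {"SHELL": 10, "SQL": 5};
     *                                 task types not listed are treated as unlimited (an assumption)
     */
    TaskTypeConcurrencyLimiter(Map<String, Integer> maxConcurrencyByTaskType) {
        maxConcurrencyByTaskType.forEach(
                (taskType, max) -> permitsByTaskType.put(taskType, new Semaphore(max)));
    }

    /** Reserves a slot without blocking; returns false when the type is at its limit. */
    boolean tryAcquire(String taskType) {
        Semaphore permits = permitsByTaskType.get(taskType);
        return permits == null || permits.tryAcquire();
    }

    /** Frees a slot; call exactly once per successful tryAcquire when the task ends. */
    void release(String taskType) {
        Semaphore permits = permitsByTaskType.get(taskType);
        if (permits != null) {
            permits.release();
        }
    }
}
```

With this sketch, worker1 from the example would be built with `Map.of("SHELL", 10, "SQL", 5)` and worker2 with `Map.of("SHELL", 5, "SQL", 2)`. A worker would call `tryAcquire` before starting a task and `release` when it finishes; what happens to a task that is rejected (queueing, or dispatch to another worker) is left open here.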

ruanwenjun avatar Nov 24 '25 14:11 ruanwenjun

I think this feature is great. Our tasks are mostly SeaTunnel + SQL, and we need to control SeaTunnel concurrency without limiting SQL concurrency.

Zzih avatar Nov 25 '25 07:11 Zzih

TaskGroup can also be used to control task concurrency. This new feature may conflict with the existing TaskGroup.

Mrhs121 avatar Nov 25 '25 08:11 Mrhs121

> TaskGroup can also be used to control task concurrency. This new feature may conflict with the existing TaskGroup.

It can be extended based on the existing TaskGroup functionality. The existing TaskGroup functionality is tied to the project, and I don't understand why it was designed this way. I think it's a bad design.

Zzih avatar Nov 25 '25 08:11 Zzih

For example, like this?

(two images attached)

Mrhs121 avatar Nov 25 '25 09:11 Mrhs121

> TaskGroup can also be used to control task concurrency. This new feature may conflict with the existing TaskGroup.

TaskGroup feels rather awkward to use. In most cases, we use worker groups to partition resources and manage concurrency. It is difficult to force users to set a TaskGroup for every task, whereas every task must be assigned to a worker group and run on a worker.

ruanwenjun avatar Nov 25 '25 14:11 ruanwenjun