Master Task Throttling

Open dhwanilpatel opened this issue 3 years ago • 10 comments

Is your feature request related to a problem? Please describe.

For many cluster activities, data nodes submit tasks to the master node, for example put-mapping, create-index, and shard-started. Sometimes, due to a bug or other issue, data nodes flood the master node with too many tasks, and we see spikes in the number of pending tasks in the master queue. This can affect the master's performance, which in turn can affect the availability of the whole cluster.

We should increase the master's resiliency against such a high number of pending tasks.

Describe the solution you'd like

We can make the master more resilient by throttling tasks on the master node. The master will reject tasks submitted by data nodes based on throttling limits. Throttling should work on a per-task-type basis, so that throttling one task type won't affect the submission of other task types. Once the master rejects a task due to throttling, the data node will retry submitting it to the master with exponential backoff.

We should add a dynamic setting for enabling and disabling throttling on the master, and the throttling configuration for each task type should also be provided through dynamic settings. With this framework, if some bug or issue in the cluster floods the master, we can enable throttling to keep the master resilient against the high task volume and disable it again once the underlying bug or issue is resolved.
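To make the proposal concrete, here is a minimal sketch of the idea in Python. It is not the implementation from the linked PRs. The names `MasterTaskThrottler`, `ThrottlingError`, `backoff_delays`, and `submit_with_retry`, along with the default delays and limits, are assumptions made up for illustration. It shows per-task-type limits, an on/off switch standing in for the dynamic setting, and a data-node retry loop with exponential backoff.

```python
import random
import time


class ThrottlingError(Exception):
    """Raised when the master rejects a task because its type is over the limit."""


class MasterTaskThrottler:
    """Hypothetical per-task-type admission control for the master's pending queue."""

    def __init__(self, enabled=False, limits=None):
        self.enabled = enabled            # stands in for the dynamic on/off setting
        self.limits = dict(limits or {})  # task type -> max pending tasks
        self.pending = {}                 # task type -> current pending count

    def submit(self, task_type):
        limit = self.limits.get(task_type)
        count = self.pending.get(task_type, 0)
        # Only task types with a configured limit are throttled; others pass through.
        if self.enabled and limit is not None and count >= limit:
            raise ThrottlingError(f"{task_type}: {count} pending, limit {limit}")
        self.pending[task_type] = count + 1

    def complete(self, task_type):
        """Called when the master finishes executing a task of this type."""
        self.pending[task_type] -= 1


def backoff_delays(base=0.05, factor=2.0, cap=5.0, jitter=0.0):
    """Yield exponentially growing delays in seconds, capped at `cap`."""
    delay = base
    while True:
        yield min(cap, delay) + random.uniform(0, jitter)
        delay *= factor


def submit_with_retry(throttler, task_type, max_attempts=5, sleep=time.sleep):
    """Data-node side: retry a throttled submission with exponential backoff.

    Returns the attempt number that succeeded; re-raises after the last attempt.
    """
    delays = backoff_delays()
    for attempt in range(1, max_attempts + 1):
        try:
            throttler.submit(task_type)
            return attempt
        except ThrottlingError:
            if attempt == max_attempts:
                raise
            sleep(next(delays))
```

With `limits={"put-mapping": 2}`, a third pending put-mapping task is rejected while create-index submissions are unaffected, which is the isolation between task types described above. Setting `enabled` to `False` turns all rejections off without touching the configured limits.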

Describe alternatives you've considered

De-duplication of tasks: We also have a de-duplication framework that prevents duplicate tasks from being submitted to the master node, but it won't help in all cases. Data nodes can submit different tasks and still flood the master, and the master can also be flooded by customer-driven activity where the tasks are not duplicates. We want to make the master resilient against a high number of pending tasks, and de-duplication alone won't achieve that.

Additional context

The master batches tasks, so it iterates over all the tasks in its queue to check whether they can be batched. These tasks also remain in the queue until they are executed, so they consume memory as well (how much depends on the task type). A high number of pending tasks in the master queue can therefore affect the CPU and JVM of the master node, and with it the availability of the whole cluster.
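The cost described above can be illustrated with a small sketch. It assumes a simplified model in which each pending task carries a batching key and the master scans the whole queue to build one batch. `PendingTask` and `collect_batch` are hypothetical names, not OpenSearch classes.

```python
from collections import namedtuple

# Hypothetical stand-in for a queued cluster-state update task.
PendingTask = namedtuple("PendingTask", ["batching_key", "payload"])


def collect_batch(queue, batching_key):
    """Scan the whole queue and split it into the batch for `batching_key` and the rest.

    Returns (batch, remaining, scanned), where `scanned` counts the tasks visited.
    """
    batch, remaining, scanned = [], [], 0
    for task in queue:
        scanned += 1
        if task.batching_key == batching_key:
            batch.append(task)
        else:
            remaining.append(task)
    return batch, remaining, scanned
```

In this model, building a batch of 3 put-mapping tasks out of a queue that also holds 1,000 unrelated tasks still visits all 1,003 entries, and every entry stays in memory until it is executed.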

dhwanilpatel Apr 01 '21 10:04

Breaking the changes into multiple PRs:

  • [ ] Add master task throttling changes in data/master nodes (#553)
  • [ ] Add throttling stats to the Stats API
  • [ ] Add documentation for the new settings

dhwanilpatel Apr 20 '21 13:04

Hi @dhwanilpatel, are you actively working on this? Could you please provide some updates?

anasalkouz Nov 08 '21 21:11

There was an attempt in https://github.com/opensearch-project/OpenSearch/pull/553 to implement this that hasn't been finished. Please feel free to pick it up where it was left!

dblock Mar 07 '22 16:03

Hello,

I am going to pick this up again and take these changes to completion. The major feedback on the last PR (#554) was to break the changes into multiple PRs for ease of review. Below is the plan for how I will split the changes into multiple PRs.

  • [x] Basic Throttler Framework / Exponential Basic back off policy. (#3527)
  • [x] Changes required in Master node to perform throttling. (#3882)
  • [x] Changes required in Data node to perform retry on throttling. (#4204)
  • [x] Provide support for all task types in throttling framework. (#4542)
  • [x] Integration Tests (#4588)

Below is the list of items for future follow-up:

  • [x] Documentation regarding new settings.
  • [x] Throttling stats in Stats API.

@dblock can you please help create a feature branch for this issue, against which we can raise multiple PRs?

dhwanilpatel May 31 '22 11:05

Sorry for the late reply - I think @CEHENKLE has a process for feature branches.

You don't need to wait on me, raise a PR and we can redirect it to a feature branch when it's ready, too.

dblock Jun 14 '22 15:06

@dhwanilpatel Hey, how's it going? Can we help at all?

CEHENKLE Jul 21 '22 15:07

@CEHENKLE so far it is going as per plan. The Data Node and Master Node side changes are in review. After those PRs, the upcoming PRs should be straightforward.

Thanks to the reviewers for their valuable feedback.

dhwanilpatel Aug 12 '22 13:08

@dhwanilpatel Cool beans. LMK if we can help :)

/C

CEHENKLE Aug 23 '22 17:08

@dhwanilpatel need a task in here for integ tests as well.

shwetathareja Aug 30 '22 08:08

@dhwanilpatel I wanted to confirm if this is on track for the 2.3 release. If so, pls add the v2.3.0 label to this issue. Thanks!

elfisher Aug 31 '22 18:08

Created a follow-up issue for exposing the throttling exception to the user: https://github.com/opensearch-project/OpenSearch/issues/4724

dhwanilpatel Oct 10 '22 10:10

@dhwanilpatel given we are calling the feature "Cluster Manager Task Throttling" can we rename this issue "Cluster Manager Task Throttling"?

elfisher Nov 07 '22 20:11