[Feature Request] Add a cluster.routing.allocation.balance option that takes shard load into account
Is your feature request related to a problem? Please describe
It is possible for the existing shard rebalance algorithm to put the cluster in a state where the shards are balanced relatively evenly across the nodes in the cluster, however, one of the nodes has shards that are much more actively written to, resulting in a load imbalance.
Describe the solution you'd like
It would be great if, in addition to the current shard and index balance weightings, there was an additional load factor taken into account when balancing decisions are made.
Related component
ShardManagement:Placement
Describe alternatives you've considered
No response
Additional context
No response
[Triage - attendees 1 2 3 4 5 6 7 8] @tronboto Thanks for filing, we would like to review a pull request around this issue.
This would be highly welcome feature. This would help to prevent hotspots in larger clusters and solve many problematic topics in write intensive clusters.
As an example, something similar in elastic: cluster.routing.allocation.balance.write_load
(float, Dynamic) Defines the weight factor for the write load of each shard, in terms of the estimated number of indexing threads needed by the shard. Defaults to 10.0f. Raising this value increases the tendency of Elasticsearch to equalize the total write load across nodes ahead of the other balancing variables.
+1 to this! I've been researching how to solve this very problem of "HOT" shards where some OpenSearch nodes in a cluster will have significantly more load (typically more write load but could also be read load), even as much as double the load when compared to other nodes in the cluster.
I read through all of the Allocation Deciders hoping to unlock some secrets but didn't find anything that would help with this.
Very excited to see this feature!
I strongly support this feature request, as it addresses a significant issue that can arise in bigger OpenSearch clusters. The current shard rebalance does a good job of distributing shards evenly across nodes, but it doesn't really take into account the actual load that those shards might generate. This can lead to situations where certain nodes become hot spots because they are handling much more write activity than others, despite having a similar number of shards.
I have not seen anything that would resolve this problem still. Maybe I am looking at this in a wrong way? In many multitenant environments where different indexes behave very differently, this causes easily situations where some nodes are overloaded and most are idling. The amount of shards could be even, but the load is very different. This has been solved at elastic side already some time ago by the "cluster.routing.allocation.balance.write_load" setting.
Anyone got good ideas how to get around this?
In a balance tool we use internally, I use a linear function based on read/write counts and time (fitted using actual load data) to measure the load of a shard.