[Bug] Leader balance doesn't work well
Describe the bug (required)
In our cluster there are 8 hosts, and each host has 54 partitions. As the replica factor is 3, each host should have 18 leaders on average. However, after leader balance, the leader distribution across the hosts (call them h0, h1, ..., h7) is 15, 18, 18, 18, 18, 19, 19, 19. I think this balance result is not good enough; can we make the balance give every host exactly 18 leaders?
More information: the partition peers of h0 are only h1, h2, h3 and h4, and those 4 hosts have 18 leaders each.
The leader balance code is here; it seems that when h0 wants to take a leader from h1, h2, h3 or h4, the move fails because the condition "minLoad < sourceLeaders.size()" is not met.
So maybe we need a better strategy for leader balance: instead of looking only at a partition's peers, the balancer may need to consider the whole cluster.
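To make the failure mode concrete, below is a minimal, self-contained sketch of the rule described above. It is not the actual Nebula implementation; the host names, the peer set and the minLoad formula are assumptions for illustration only.

```cpp
// A minimal, simplified model (NOT the actual Nebula code) of the peer-only
// leader balance rule described above.
#include <iostream>
#include <map>
#include <set>
#include <string>

int main() {
    // Leader counts from this issue: h0 has 15 leaders, its only partition
    // peers (h1..h4) have 18 each, and the remaining hosts have 19 each.
    std::map<std::string, int> leaders = {
        {"h0", 15}, {"h1", 18}, {"h2", 18}, {"h3", 18},
        {"h4", 18}, {"h5", 19}, {"h6", 19}, {"h7", 19}};
    // h0 only shares partitions with h1..h4, so it can only take leaders
    // from these four hosts.
    std::set<std::string> peersOfH0 = {"h1", "h2", "h3", "h4"};

    int total = 0;
    for (const auto& kv : leaders) total += kv.second;
    int minLoad = total / static_cast<int>(leaders.size());  // 144 / 8 = 18

    // Try to move one leader onto h0 from any of its peers.
    bool moved = false;
    for (const auto& src : peersOfH0) {
        // The condition this issue points at: a source host only gives up a
        // leader when it holds MORE than minLoad leaders.
        if (minLoad < leaders[src]) {
            --leaders[src];
            ++leaders["h0"];
            moved = true;
            break;
        }
    }
    std::cout << (moved ? "moved one leader to h0"
                        : "stuck: every peer of h0 already sits at minLoad")
              << " (minLoad = " << minLoad << ")\n";
    return 0;
}
```

With the counts from this issue, the check is "18 < 18", which is never true, so h0 stays at 15 while h5, h6 and h7 keep 19.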
Your Environments (required)
- OS: uname -a
- Compiler: g++ --version or clang++ --version
- CPU: lscpu
- Commit id (e.g. a3ffc7d8)
How To Reproduce (required)
Steps to reproduce the behavior:
- Step 1
- Step 2
- Step 3
Expected behavior
Additional context
It seems that when h0 wants to take a leader from h1, h2, h3 or h4, the move fails because the condition "minLoad < sourceLeaders.size()" is not met.
In your example, what is the minLoad of h0, 18?
I think the scenario you describe does exist: h0 only overlaps with h1, h2, h3 and h4, but they all have 18 leaders.
But do we really need to make it a perfect 18?
In your example, what is the minLoad of h0, 18?
Yes, minLoad is 18, maxLoad is 19
I think the scenario you describe does exist: h0 only overlaps with h1, h2, h3 and h4, but they all have 18 leaders.
But do we really need to make it a perfect 18?
When the cluster is under high access pressure, for example when a server's CPU usage is nearly full, clients will receive many errors because one or more machines are under higher pressure while other machines may still have headroom.
I think it would be better if each server held exactly 18 leaders, and if that can be done easily there is no harm, so it is worth doing.
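One way to read the "consider the whole cluster" suggestion is to allow a leader to be shifted along a chain of peers, so an overloaded host that shares no partition with the underloaded host can still hand a leader down through intermediate hosts. The sketch below only illustrates that idea and is not Nebula's algorithm; the peer adjacency is invented, and real code would also have to pick a concrete partition for every hop.

```cpp
// Illustrative sketch: move one leader along a peer path from an over-loaded
// host to an under-loaded one. Intermediate hosts end up unchanged.
#include <iostream>
#include <map>
#include <queue>
#include <set>
#include <string>

using Counts = std::map<std::string, int>;
using Peers  = std::map<std::string, std::set<std::string>>;

// Returns false when no over-loaded host can reach an under-loaded host.
bool shiftOneLeader(Counts& leaders, const Peers& peers, int target) {
    for (const auto& [src, n] : leaders) {
        if (n <= target) continue;               // src is not over-loaded
        // BFS from the over-loaded host towards any under-loaded host.
        std::map<std::string, std::string> prev;
        std::queue<std::string> q;
        q.push(src);
        prev[src] = src;
        while (!q.empty()) {
            std::string cur = q.front();
            q.pop();
            if (leaders.at(cur) < target) {      // found an under-loaded host
                // Walk back along the path, moving one leader per hop.
                for (std::string h = cur; h != src; h = prev[h]) {
                    ++leaders[h];
                    --leaders[prev[h]];
                }
                return true;
            }
            auto it = peers.find(cur);
            if (it == peers.end()) continue;
            for (const auto& nb : it->second) {
                if (!prev.count(nb)) {
                    prev[nb] = cur;
                    q.push(nb);
                }
            }
        }
    }
    return false;
}

int main() {
    Counts leaders = {{"h0", 15}, {"h1", 18}, {"h2", 18}, {"h3", 18},
                      {"h4", 18}, {"h5", 19}, {"h6", 19}, {"h7", 19}};
    // Assumed adjacency (purely illustrative): h0 only shares partitions
    // with h1..h4, while h5..h7 share partitions with h1..h4.
    Peers peers = {
        {"h0", {"h1", "h2", "h3", "h4"}},
        {"h1", {"h0", "h5", "h6", "h7"}}, {"h2", {"h0", "h5", "h6", "h7"}},
        {"h3", {"h0", "h5", "h6", "h7"}}, {"h4", {"h0", "h5", "h6", "h7"}},
        {"h5", {"h1", "h2", "h3", "h4"}}, {"h6", {"h1", "h2", "h3", "h4"}},
        {"h7", {"h1", "h2", "h3", "h4"}}};
    int target = 18;                              // 144 leaders over 8 hosts
    while (shiftOneLeader(leaders, peers, target)) {}
    for (const auto& [h, n] : leaders) std::cout << h << ": " << n << "\n";
}
```

Run on the distribution from this issue (15, 18, 18, 18, 18, 19, 19, 19), three such chain moves end with every host holding exactly 18 leaders.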
@wey-gu
We are observing this imbalance in v3.6.0; below is our cluster info: metad: 3, graphd: 3, storaged: 7, replica factor: 3, number of partitions: 140.
After several BALANCE LEADER attempts:
Expected leader distribution: 20, 20, 20, 20, 20, 20, 20
Actual leader distribution: 26, 26, 27, 15, 14, 17, 15
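For reference, 140 partitions over 7 storaged hosts should give 140 / 7 = 20 leaders per host, so the reported spread of 14 to 27 is far outside a one-leader tolerance. A quick check with the numbers above (illustrative snippet only):

```cpp
// Arithmetic check of the reported v3.6.0 distribution:
// 140 partitions over 7 storaged hosts should give 20 leaders each.
#include <algorithm>
#include <iostream>
#include <numeric>
#include <vector>

int main() {
    std::vector<int> actual = {26, 26, 27, 15, 14, 17, 15};
    int total = std::accumulate(actual.begin(), actual.end(), 0);     // 140
    int expected = total / static_cast<int>(actual.size());           // 20
    auto [mn, mx] = std::minmax_element(actual.begin(), actual.end());
    std::cout << "expected per host: " << expected
              << ", actual spread: " << *mn << ".." << *mx << "\n";
}
```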
@songqing you have only 8 hosts; aren't you supposed to have an odd number of hosts for Raft?
I think the number of hosts has nothing to do with the leader distribution; both odd and even numbers are fine. The leader balance algorithm is the key problem.
Maybe for the distribution, but aren't you supposed to have an odd number of hosts?
In any case, this leader imbalance is hurting performance very badly on a huge graph. Our space has a total vertex count of 2.8 billion and a total edge count of 1 billion.
The number of metad hosts should be odd; storaged has no such limitation, I think.
Yes, we can have an even number of storage hosts; the thing that should be odd is the replica factor for spaces.
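For completeness, the reason the replica factor is what should be odd is plain Raft arithmetic: a group of n replicas needs a majority of n/2 + 1 votes, so an even n tolerates no more failures than n - 1 replicas would, and the number of storage hosts does not enter into it. A small, Nebula-independent illustration:

```cpp
// Why the replica factor (not the storage host count) should be odd: an even
// replica factor needs a larger quorum without tolerating more failures.
#include <iostream>

int main() {
    for (int replicas = 2; replicas <= 6; ++replicas) {
        int quorum = replicas / 2 + 1;
        int tolerated = replicas - quorum;
        std::cout << "replica factor " << replicas
                  << ": quorum " << quorum
                  << ", tolerates " << tolerated << " failed replica(s)\n";
    }
}
```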