tidb
tidb copied to clipboard
Region is not distributed after plug a new server
Hi all, I have a problem related to distribution data in the cluster after i plugged new servers into the current cluster. I have two datacenter in one city. Firstly, i only ran one datacenter (id: 1) with 5 servers. After that I plugged in a new datacenter (id: 2) with 3 servers. I set up leaders to only place in the datacenter 1, so servers of datacenter 2 have no leader. Then, I monitored and saw that data was rebalanced from datacenter 1 to datacenter 2. But there is a problem. Can see below image:

We can easily see node 1.0.1.20 (node in datacenter 1) and node 1.0.0.23 (node in datacenter 2) has more region score than other nodes, although leaders is balance between nodes in cluster.
Before I plugged servers of datacenter 2 to a TiDB cluster of datacenter 1, servers of datacenter 1 have the same region and data size.
Besides, we also triggered the rebalance region but it did not resolve the problem.
My config here:
» config show
{
"replication": {
"enable-placement-rules": "true",
"enable-placement-rules-cache": "false",
"isolation-level": "dc",
"location-labels": "zone,dc,rack,host",
"max-replicas": 5,
"strictly-match-label": "false"
},
"schedule": {
"enable-cross-table-merge": "true",
"enable-joint-consensus": "true",
"high-space-ratio": 0.7,
"hot-region-cache-hits-threshold": 2,
"hot-region-schedule-limit": 8,
"hot-regions-reserved-days": 7,
"hot-regions-write-interval": "10m0s",
"leader-schedule-limit": 4,
"leader-schedule-policy": "size",
"low-space-ratio": 0.8,
"max-merge-region-keys": 200000,
"max-merge-region-size": 20,
"max-pending-peer-count": 64,
"max-snapshot-count": 64,
"max-store-down-time": "30m0s",
"merge-schedule-limit": 8,
"patrol-region-interval": "10ms",
"region-schedule-limit": 2048,
"region-score-formula-version": "v2",
"replica-schedule-limit": 64,
"split-merge-interval": "1h0m0s",
"tolerant-size-ratio": 20
}
}
I think that PD seems to wrongly calculate region score lead to unbalance region score.
I have some questions:
- What is happened problem?
- Why does it happen?
- How to balance data in the cluster? How to fix it?
Thank you.
@tiancaiamao Can you help me?
@tiancaiamao Can you help me?
You can ask question here https://github.com/tikv/pd Region scheduling is pd's work, which I'm not familiar with.
Tks @tiancaiamao