[bug & fix] vnode status stays "unsynced" after the tarbitrator process shuts down, even after it starts up again
TDengine version: 2.2.1.3
key configurations:
replica 2
numOfMnodes 3
number of dnodes: 3
an arbitrator is used
We believe the problem occurs when:
1. The tarbitrator process is shut down for some reason. [We don't know why it shut down, but we can reproduce the issue this way. There may be other causes for the slave dnode's 'unsynced' status, such as a special case in detection and election.]
2. The master dnode of one vgroup is shut down or goes offline.
3. We start tarbitrator again.
4. Hours later, the slave dnode still has not changed its status from unsynced to master.
The status changes at each step look like this:
1: taos> show vgroups;
vgId | tables | status | onlines | v1_dnode | v1_status | v2_dnode | v2_status | compacting |
=================================================================================================================
7 | 53 | ready | 2 | 3 | slave | 2 | master | 0 |
2: taos> show vgroups;
vgId | tables | status | onlines | v1_dnode | v1_status | v2_dnode | v2_status | compacting |
=================================================================================================================
7 | 53 | ready | 1 | 3 | unsynced | 2 | master | 0 |
3, 4: taos> show vgroups;
vgId | tables | status | onlines | v1_dnode | v1_status | v2_dnode | v2_status | compacting |
=================================================================================================================
7 | 53 | ready | 0 | 3 | unsynced | 2 | offline | 0 |
// If some node is unsynced (only the slave dnode needs to transition) and there is no master node (the master may be offline), we think the TAOS_SYNC_ROLE_UNSYNCED dnode should have a chance to find the index.
So we add two flags to detect the abnormal state above:
unsync_count: if > 0, we assume there is an unsynced dnode.
master_count: if = 0, we assume the master dnode is shut down.
When both conditions hold, the unsynced dnode (the slave dnode) should have a chance to be elected as master.
Test and verify:
with the modified code, the unsynced status changes to master;
with the modified code, the master dnode keeps its 'master' status when the slave dnode is shut down (existing behavior, unaffected).
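The check described above can be sketched roughly as follows. This is only an illustration of the counting logic, not the actual patch: the `replica_role_t` enum and the `unsynced_may_be_elected` function are hypothetical names, loosely modeled on the `TAOS_SYNC_ROLE_*` roles, and the real change lives inside the sync module's role-selection code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical replica roles for illustration only; the real code uses
 * the TAOS_SYNC_ROLE_* values. */
typedef enum {
  ROLE_OFFLINE,
  ROLE_UNSYNCED,
  ROLE_SLAVE,
  ROLE_MASTER
} replica_role_t;

/* Returns true when at least one replica is unsynced and no replica is
 * currently master, i.e. the abnormal state described in this PR. */
bool unsynced_may_be_elected(const replica_role_t *roles, size_t n) {
  assert(roles != NULL || n == 0);

  int unsync_count = 0; /* > 0 means there is an unsynced dnode */
  int master_count = 0; /* == 0 means the master dnode is gone  */

  for (size_t i = 0; i < n; ++i) {
    if (roles[i] == ROLE_UNSYNCED) ++unsync_count;
    if (roles[i] == ROLE_MASTER) ++master_count;
  }

  return unsync_count > 0 && master_count == 0;
}
```

With the vgroup from the listings above, the roles (unsynced, offline) would make the function return true, while (unsynced, master) or (slave, master) would return false, so a healthy master is left untouched.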